Clever Algorithms: Nature-Inspired Programming Recipes -

Frontier Nerds -

Hacking for Artists -

IBM Developerworks Open Source -

Linux Virtualization Wiki -

NASA History Series Publications -

The Nature of Code -

Practical Common Lisp -

txt2re Regular Expression Generator -

Virt Tools Blog Planet -


Journal of Statistical Software -

Philosophical Transactions of the Royal Society A -

Statistical Analysis and Data Mining: The ASA Data Science Journal -


A more general treatment of the philosophy of physics and the existence of universes (paper, PDF, 39) -

Chiphack for teens: Silicon chip design for teenagers (TR, PDF, 38) -

GNU Autoconf, Automake, and Libtool (book, HTML, NA) -

Random walks and electric networks (book, PDF, 118) -

Power-law distributions in empirical data (paper, PDF, 43) -

The structure and function of complex networks (paper, PDF, 58) -

Complex Systems: A Survey (paper, PDF, 10) -

Power laws, Pareto distributions and Zipf’s law (paper, PDF, 28) -

Big Graph Mining for the Web and Social Media: Algorithms, Anomaly Detection, and Applications (slides, PDF, 254) -

Factoring Tensors in the Cloud: A Tutorial on Big Tensor Data Analytics -

Statistical Data Mining Tutorials -

A Gestalt Framework for Virtual Machine Control of Automated Tools -

QEMU, Kernel-based Virtual Machine (KVM), Xen + libvirt -

Containers are the new static binaries -

Enabling FPGAs for the Masses (2014) -

A Collection of Definitions of Intelligence (2007, 12) -

A model-free control strategy for an experimental greenhouse with an application to fault accommodation (2014, 11) -

Hypercomputation: computing more than the Turing machine (2002, 57) -

Information, complexity, brains and reality (Kolmogorov Manifesto) (2007, 68) -

Computational science and re-discovery: open-source implementations of ellipsoidal harmonics for problems in potential theory (2012, 25) -

A Practical Guide to Tensegrity Design, 2nd Ed. (2008, 212) -

Theory of machines through the 20th century (2014, ?) -

Tensegrity frameworks: Static analysis review (2008, 23) -

Solving ordinary differential equations on the Infinity Computer by working with infinitesimals numerically (2013, 25) -

Tensor Decompositions and Applications (2009, 46) -

Dynamical Analogies (1943, 208) -

Discrete Exterior Calculus (2005, 53) -

A reference discretization strategy for the numerical solution of physical field problems (2002, 137) -

The reason for analogies between physical theories (2005, 14) -

Finite element exterior calculus, homological techniques, and applications (2006, 155) -

Finite element exterior calculus: from Hodge theory to numerical stability (2010, 74) -

A finite element exterior calculus framework for the rotating shallow-water equations (2014, 21) -

Differential forms for scientists and engineers (2014, 21) -

The chain collocation method: A spectrally accurate calculus of forms (2014, 21) -

Why starting from differential equations for computational physics? (2014, 31) -

An introduction to Lie group integrators: basics, new developments and applications (2014, 21) -

Mimetic finite difference method (2014, 65) -

The Expressive Web: HTML5 and CSS3 Features -

Julia: A Fresh Approach to Numerical Computing -

Escaping RGBland: Selecting Colors for Statistical Graphs -

A Tour Through the Visualization Zoo: A survey of powerful visualization techniques, from the obvious to the obscure -

Data Visualization (course with PDF slides) -

Machine Learning from Data (course with PDF notes & videos) -

The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2009, 764) -

A Course in Machine Learning (2015, 191) -

Is Parallel Programming Hard, And, If So, What Can You Do About It? (2015, 467) -

The compressional beta effect: a source of zonal winds in planets? -

Process-Oriented Parallel Programming with an Application to Data-Intensive Computing -

Stochastic parametrizations and model uncertainty in the Lorenz ’96 system -

A Local Ensemble Kalman Filter for Atmospheric Data Assimilation (Lorenz 96) -

Equivalence of Non-Equilibrium Ensembles and Representation of Friction in Turbulent Flows: The Lorenz 96 Model -

Frontiers of chaotic advection -

Do the Navier-Stokes equations embody all physics in a flow of Newtonian fluids? -

Variational Integrators for Nonvariational Partial Differential Equations -

Compactly Supported Wavelets Derived From Legendre Polynomials: Spherical Harmonic Wavelets -

Tensor computations in computer algebra systems -

The Gödel Phenomena in Mathematics: A Modern View -

P, NP and mathematics: a computational complexity perspective -

A Survey of the Development of Geometry up to 1870 -

[Contemporary Pure] Math is far LESS than the Sum of its [Too Numerous] Parts -

Knowledge-Based Automatic Generation of Linear Algebra Algorithms and Code -

Optimized Composition: Generating Efficient Code for Heterogeneous Systems from Multi-Variant Components, Skeletons and Containers -

Atmospheric Circulation of Terrestrial Exoplanets -

Atmospheric dynamics of terrestrial exoplanets over a wide range of orbital and atmospheric parameters -

Atmospheric Dynamics of Hot Exoplanets -

Analytical Models of Exoplanetary Atmospheres. I. Atmospheric Dynamics via the Shallow Water System -

Analytical Models of Exoplanetary Atmospheres. II. Radiative Transfer via the Two-stream Approximation -

A Study of Climate on Alien Worlds -

Changing Computing Paradigms Towards Power Efficiency -

Simple, Parallel, High-Performance Virtual Machines for Extreme Computations -

The Physical Basis of Dimensional Analysis -

Dimensional Analysis: A Centenary Update -

Origins of Megalithic Astronomy in Britain -

On the use and significance of isentropic potential-vorticity maps -

Lucidity and science: the deepest connections -

Circumveiloped by Obscuritads. The nature of interpretation in quantum mechanics, hermeneutic circles and physical reality, with cameos of James Joyce and Jacques Derrida -

On Symmetry and Conserved Quantities in Classical Mechanics -

On Symplectic Reduction in Classical Mechanics -

Renormalization for Philosophers -

A Layman’s Guide to M-theory -

State of the Unification Address -

The world in eleven dimensions: a tribute to Oskar Klein -

How fundamental are fundamental constants? -

Action Principle for Hydrodynamics and Thermodynamics including general, rotational flows -

Modern geometry in not-so-high echelons of physics: Case studies -

Clifford algebra, geometric algebra, and applications -

Geometric Algebra: A natural representation of three-space -

Multivector Differential Calculus -

Geometric algebra -

Cartoon computation: quantum-like computing without quantum mechanics -

Geometry and the physics of seasons -

The sundial problem from a new angle -

Traditional vectors as an introduction to geometric algebra -

Cross-language Babel structs—making scientific interfaces more efficient -

Lorenz, Gödel and Penrose: new perspectives on determinism and causality in fundamental physics -

Connections between symmetries and conservation laws -

Similarity: generalizations, applications and open problems -

Construction of Conservation Laws: How the Direct Method Generalizes Noether’s Theorem -

Mobile phone as a platform for numerical simulation -

Symmetry analysis of a system of modified shallow-water equations -

The multiplier method to construct conservative finite difference schemes for ordinary and partial differential equations -

Lecture notes in fluid mechanics: From basics to the millennium problem -

Navier-Stokes Hamiltonian -

On Dynamic Mode Decomposition: Theory and Applications -

Data-Driven Reduction for Multiscale Stochastic Dynamical Systems -

A Data-Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition -

Data Fusion via Intrinsic Dynamic Variables: An Application of Data-Driven Koopman Spectral Analysis -

Applied Koopmanism -

Variants of dynamic mode decomposition: connections between Koopman and Fourier analyses -

Dynamic Mode Decomposition for Large and Streaming Datasets -

Frontiers of Chaotic Advection -

A Unified Framework for Numerical and Combinatorial Computing -

A Unified View of Matrix Factorization Models -

Era of Big Data Processing: A New Approach via Tensor Networks and Tensor Decompositions -

Tensor Numerical Methods for High-dimensional PDEs: Basic Theory and Initial Applications -

Introduction to Tensor Numerical Methods in Scientific Computing (PDF course slides, 238 pp.) -

Numerical operator calculus in higher dimensions -

A literature survey of low-rank tensor approximation techniques -

Algebraic Wavelet Transform via Quantics Tensor Train Decomposition -

Tensor Decompositions for Signal Processing Applications From Two-way to Multiway Component Analysis -

Geometric Algebra Computing (PDF Course Slides) -

Eloquent JavaScript -

How I Start (Erlang, Elixir, Ruby, Go, Haskell, Nim) -

An Absolute Beginner’s Guide to node.js -

The Elements of Statistical Learning -

Information Theory, Inference, and Learning Algorithms (2003, 640) -

Gaussian Processes for Machine Learning -

Univariate Probability Distribution Relationships -

Associative Arrays: Unified Mathematics for Spreadsheets, Databases, Matrices, and Graphs (2015, 4) -


Fast Intrinsic Mode Decomposition and Filtering of Time Series Data -

The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis -

Intrinsic time-scale decomposition: time–frequency–energy analysis and real-time filtering of non-stationary signals -

Defining a Trend for a Time Series Which Makes Use of the Intrinsic Time-Scale Decomposition -

Comparison between intrinsic time-scale decomposition and Hilbert-Huang transform -

Testing time series irreversibility using complex network methods -

Nonlinear multivariate and time series analysis by neural network methods -


A new covariant form of the equations of GFD and their structure-preserving discretization (slides, PDF, 44) -

Automating the solution of PDEs on the sphere and other manifolds in FEniCS 1.2 (2013, 21) -

A standard test case suite for two-dimensional linear transport on the sphere: results from a collection of state-of-the-art schemes (2014, 41) -

A library of benchmark results is provided to facilitate scheme intercomparison and model development. Simple software and data sets are made available to facilitate the process of model evaluation and scheme intercomparison.

A 24-variable low-order coupled ocean–atmosphere model: OA-QG-WS v2 (2014, 14) -

Testing time series irreversibility using complex network methods (2013, 6) -

Inferences on weather extremes and weather-related disasters: a review of statistical methods (2012, 22) -

Constructing Proxy Records from Age models (COPRA) (2012, 15) -

Why could ice ages be unpredictable? (2013, 46) -

A brief history of ice core science over the last 50 yr -

Nonlinear regime shifts in Holocene Asian monsoon variability: potential impacts on cultural change and migratory patterns (2014, 81) -

Disentangling different types of El Niño episodes by evolving climate network analysis -

How complex climate networks complement eigen techniques for the statistical analysis of climatological data -

Available potential energy density for a multicomponent Boussinesq fluid with arbitrary nonlinear equation of state -

A generalized mathematical model of geostrophic adjustment and frontogenesis: uniform potential vorticity -

Geostrophic adjustment in a closed basin with islands -

Geostrophic adjustment with gyroscopic waves: barotropic fluid without the traditional approximation -

Topological selection in stratified fluids: an example from air–water systems -

The virtual power principle in fluid mechanics -

Consistent shallow-water equations on the rotating sphere with complete Coriolis force and topography -

Fully adaptive turbulence simulations based on Lagrangian spatio-temporally varying wavelet thresholding -

New perspectives on superparameterization for geophysical turbulence -

The use of imprecise processing to improve accuracy in weather & climate prediction -

Adaptive wavelet collocation method on the shallow water model -



Climate science in the tropics: waves, vortices and PDEs -

Singular vectors, predictability and ensemble forecasting for weather and climate -

Finite size Lyapunov exponent: review on applications -

The concept of information in physics: an interdisciplinary topical lecture -

Short- and long-term forecast for chaotic and random systems (50 years after Lorenz’s paper) -

The real butterfly effect -

Numerical methods for Hamiltonian PDEs -

Vorticity and symplecticity in Lagrangian fluid dynamics -

A variational derivation of the geostrophic momentum approximation -

The influence of fast waves and fluctuations on the evolution of the dynamics on the slow manifold -

Restricted equilibrium and the energy cascade in rotating and stratified flows -

Ortho-normal quaternion frames, Lagrangian evolution equations and the three-dimensional Euler equations -

The gradient of potential vorticity, quaternions and an orthonormal frame for fluid particles -

The dynamics of the gradient of potential vorticity -

Symmetries in atmospheric sciences -

Lie reduction and exact solutions of vorticity equation on rotating sphere -

Invariant parameterization and turbulence modeling on the beta-plane -

Complete point symmetry group of vorticity equation on rotating sphere -

Tractor beam on the water surface -

A Didactic Approach to Linear Waves in the Ocean -

Identifying Finite-Time Coherent Sets from Limited Quantities of Lagrangian Data -

Quasi-geostrophic modes in the Earth’s fluid core with an outer stably stratified layer -

Generalized Ertel’s theorem and infinite hierarchies of conserved quantities for three-dimensional time-dependent Euler and Navier–Stokes equations -

Analogous formulation of electrodynamics and two-dimensional fluid dynamics -

Enumeration, orthogonality and completeness of the incompressible Coriolis modes in a sphere -

A General Approach for Producing Hamiltonian Numerical Schemes for Fluid Equations -

Continuous and discrete Clebsch variational principles -

Energy- and enstrophy-conserving schemes for the shallow-water equations, based on mimetic finite elements -

A finite element exterior calculus framework for the rotating shallow-water equations -

Automated generation and symbolic manipulation of tensor product finite elements -

Compatible finite element methods for numerical weather prediction -

Causality between time series -

A survey on tidal analysis and forecasting methods for Tsunami detection -

Attraction-Based Computation of Hyperbolic Lagrangian Coherent Structures -

Accurately Estimating the State of a Geophysical System with Sparse Observations: Predicting the Weather -

Flow networks: A characterization of geophysical fluid transport -

Stochastic Climate Theory and Modelling -

The Impact of Oceanic Heat Transport on the Atmospheric Circulation: a Thermodynamic Perspective -

The impact of oceanic heat transport on the atmospheric circulation -

Spectral diagonal ensemble Kalman filters -

Chaotic Lagrangian transport and mixing in the ocean -

LCS Tool: A Computational Platform for Lagrangian Coherent Structures -

Automated detection of coherent Lagrangian vortices in two-dimensional unsteady flows -

Do Finite-Size Lyapunov Exponents Detect Coherent Structures? -

An Alternative Formulation of Lyapunov Exponents for Computing Lagrangian Coherent Structures -

Shallow water equations for large bathymetry variations -

The unity of instantaneous spectral moments and physical moments -

Extracting waves and vortices from Lagrangian trajectories -

Analysis of the 3DVAR Filter for the Partially Observed Lorenz '63 Model -

Stability of Filters for the Navier-Stokes Equation -

Fluctuation-Dissipation: Response Theory in Statistical Physics -

General aspects of the Fluctuation-Dissipation Relation (FDR) and Response Theory are considered. After analyzing the conceptual and historical relevance of fluctuations in statistical mechanics, we illustrate the relation between the relaxation of spontaneous fluctuations and the response to an external perturbation. These studies date back to Einstein’s work on Brownian Motion, were continued by Nyquist and Onsager, and culminated in Kubo’s linear response theory. The FDR was originally developed in the framework of the statistical mechanics of Hamiltonian systems; nevertheless, a generalized FDR holds under rather general hypotheses, regardless of the Hamiltonian or equilibrium nature of the system. In the last decade, this subject was revived by the works on Fluctuation Relations (FR) concerning far-from-equilibrium systems. The connection of these works with large deviation theory is analyzed. Some examples, beyond the standard applications of statistical mechanics, where fluctuations play a major role are discussed: fluids, granular media, nano-systems and biological systems.

The prediction of future from the past: an old problem from a modern perspective -

About the role of chaos and coarse graining in Statistical Mechanics -

Twenty-five years of multifractals in fully developed turbulence -

An update on the double cascade scenario in two-dimensional turbulence -

Introduction to chaos and diffusion -

Lagrangian drifter dispersion in the southwestern Atlantic Ocean -

Geometric numerical schemes for the KdV equation -

A hierarchy of energy- and flux-budget (EFB) turbulence closure models for stably stratified geophysical flows -

A Statistical Mechanical Approach for the Computation of the Climatic Response to General Forcings -

Entropy Production and Coarse Graining of the Climate Fields in a General Circulation Model -

Towards a General Theory of Extremes for Observables of Chaotic Dynamical Systems -

Mathematical and Physical Ideas for Climate Science -

Geometric invariants of the horizontal velocity gradient tensor and their dynamics in shallow water flow -

Enstrophy bounds and the range of space-time scales in the hydrostatic primitive equations -

A geometric interpretation of coherent structures in Navier-Stokes flows -

Multisymplectic variational integrators and space/time symplecticity -

Hyper-Kaehler geometry and semi-geostrophic theory -

Quaternions and particle dynamics in the Euler fluid equations -

Clebsch variational principles in field theories and singular solutions of covariant EPDiff equations -

Generalised semi-geostrophic theory on a sphere -

Statistical mechanics of two-dimensional and geophysical flows -

The Equivalence of the Lagrangian-Averaged Navier-Stokes-α Model and the Rational LES model in Two Dimensions -

The Euler-Poincare Equations in Geophysical Fluid Dynamics -

Applications of Poisson Geometry to Physical Problems -

Are there higher-accuracy analogues of semi-geostrophic theory? -

Hamiltonian Balanced Models: Constraints, Slow Manifolds and Velocity Splitting -

IPython Notebooks

IPython Mini-Book Examples -

Why the three biggest positive contributions to reproducible research are the iPython Notebook, knitr, and Galaxy -

The IPython Notebook: Get Close to Your Data with Python and JavaScript -

21 Interactive Plots from matplotlib, ggplot for Python, prettyplotlib, Stack Overflow, and seaborn -

matta: view and scaffold d3.js visualizations in IPython notebooks -

Lectures and materials for the Fall 2014 Skoltech numerical linear algebra course -

Learn Data Science -

Using Iris to Access Data from US-IOOS Models -

Using Iris to access NCEP CFSR 30-year Wave Hindcast -


NumPy in the browser: proof of concept with Numba, LLVM, and emscripten -

Awesome Python: A curated list of awesome Python frameworks, libraries and software -

Intermediate Pythonista -

Fast Non-Standard Data Structures for Python -

Fun With Python Lists -

Importing Python Modules -

Solving Problems with Sets and Comprehensions in Python -

A guide to Python’s function decorators -

How to run external programs from Python and capture their output -

Python Metaprogramming for Mad Scientists and Evil Geniuses -

Advanced Use of Python Decorators and Metaclasses -

Dataframes in Python -

Learn Python Through Public Data Hacking -

Mistakes Developers Make When Using Python for Big Data -

Digital Signal Processing in Python -

Overview of Python Visualization Tools -

Data Science 45-Minute Ipython Notebook Intros -

Must-Watch Videos About Python -

Plotly for Python User Guide -

Let’s Write an LLVM Specializer for Python -

Use pew, not virtualenvwrapper, for Python virtualenvs -

Simple and effective coin segmentation using Python and OpenCV -

Interactive Data Visualization with D3.js, DC.js, Python, and MongoDB -

Getting started with Graphviz and Python -

Natural Language Processing with Python -

Hacker’s Guide to Neural Networks -

Machine Learning for Hackers (Python notebooks of examples in MLFH book) -

Probabilistic Programming and Bayesian Methods for Hackers -

Full Stack Development - Fetching Data, Visualizing With D3, and Deploying With Dokku -

An Even Better Lisp Interpreter in Python -

Practical Data Science in Python -

Frequentism and Bayesianism: A Python-driven Primer -

TDD Web Dev With Python (ASCIIDOC format) -

New Section


AMD Core Math Library, or ACML, provides a free set of thoroughly optimized and threaded math routines for HPC, scientific, engineering and related compute-intensive applications. ACML is ideal for weather modeling, computational fluid dynamics, financial analysis, oil and gas applications and more. ACML consists of the following main components:

  • A full implementation of Level 1, 2 and 3 Basic Linear Algebra Subroutines (BLAS), with key routines optimized for high performance on AMD Opteron™ processors. The BLAS level 3 routines will take advantage of heterogeneous computing through OpenCL if detected.

  • A full suite of Linear Algebra (LAPACK) routines. As well as taking advantage of the highly-tuned BLAS kernels, a key set of LAPACK routines has been further optimized to achieve considerably higher performance than standard LAPACK implementations.

  • Beginning with version 6 of ACML, a subset of the FFTW interfaces is supported for Fourier transform functionality. Heterogeneous compute with GPU/APU and OpenCL is supported through the FFTW interfaces. The comprehensive set of FFTs available through the ACML-specific API (found in version 5 and older) continues to be available in version 6.

  • Random Number Generators in both single- and double-precision.
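
As a rough illustration of what the three BLAS levels mentioned above compute, here is a minimal pure-Python sketch. The function names follow the standard BLAS routine names (axpy, gemv, gemm), but these are unoptimized stand-ins written for this note, not ACML's API or calling convention.

```python
# Reference semantics of the three BLAS levels in plain Python.
# Illustrative stand-ins only; not ACML's interface.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): return alpha*A*x + beta*y."""
    return [alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
            for row, yi in zip(A, y)]

def gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): return alpha*A*B + beta*C."""
    cols_B = list(zip(*B))  # columns of B, so each entry is a dot product
    return [[alpha * sum(a * b for a, b in zip(row, col)) + beta * cij
             for col, cij in zip(cols_B, crow)]
            for row, crow in zip(A, C)]

if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))
    print(gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0]))
    print(gemm(1.0, A, identity, 0.0, zeros))
```

Level 3 routines do O(n^3) arithmetic on O(n^2) data, which is one reason they are the usual candidates for offloading to an accelerator.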

Active Papers

ActivePapers is a research and development project whose aim is to make computational science more open and more reliable, by making computations reproducible and publishable. It is a file format for storing computations.

An ActivePaper is a file combining datasets and programs working on these datasets in a single package, which also contains a detailed history of which data was produced when, by running which code, and on which machine. It is a complete record of the state of a computational research project that can be shared among collaborators and in the end published as supplementary material to a journal article.


An approximate MAP decoder with Alternating Direction Dual Decomposition. AD3 (Alternating Directions Dual Decomposition) is an LP-MAP decoder for undirected constrained factor graphs. In other words, it is an approximate MAP decoder that retrieves the solution of an LP relaxation of the original problem.

The input is a factor graph, which may contain both soft factors, associated with log-potentials, and hard constraint factors, associated with a logic function. Factors can be dense, sparse, or combinatorial. Specialized factors can be implemented by the practitioner.

The output is the LP-MAP assignment, with a posterior value for each variable. If all variables are integer, the relaxation is tight and the solution is the true MAP. Otherwise, some entries can be in the unit interval. External tools can be used to obtain a valid solution using rounding heuristics. Optionally, a flag can be set that applies a branch-and-bound procedure and retrieves the true MAP (but it can be slow if the relaxation has many fractional components).
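
To make the input and output described above concrete, the following sketch finds the exact MAP assignment of a tiny binary factor graph by exhaustive enumeration. This is deliberately not AD3's algorithm (no LP relaxation, no dual decomposition) and uses none of its API; the function name, the factor representation and the example potentials are all made up for illustration.

```python
from itertools import product

def map_assignment(num_vars, soft_factors, hard_factors):
    """Exhaustively find the highest-scoring binary assignment.

    soft_factors: list of (variable_indices, log_potential_function)
    hard_factors: list of (variable_indices, logic_function returning bool)
    """
    best, best_score = None, float("-inf")
    for assignment in product((0, 1), repeat=num_vars):
        # Hard constraint factors rule assignments out entirely.
        if not all(fn(*(assignment[i] for i in idx)) for idx, fn in hard_factors):
            continue
        # Soft factors contribute additive log-potentials.
        score = sum(fn(*(assignment[i] for i in idx)) for idx, fn in soft_factors)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

if __name__ == "__main__":
    # Three binary variables with hypothetical unary log-potentials.
    unary = [1.0, 0.8, -0.5]
    soft = [((i,), lambda v, w=w: w * v) for i, w in enumerate(unary)]
    # Pairwise soft factor rewarding agreement between variables 1 and 2.
    soft.append(((1, 2), lambda a, b: 0.6 if a == b else 0.0))
    # Hard logic factor: exactly one of variables 0 and 1 is switched on.
    hard = [((0, 1), lambda a, b: a + b == 1)]
    print(map_assignment(3, soft, hard))
```

Enumeration costs 2^n evaluations, which is why practical decoders work with a relaxation and fall back to branch-and-bound only when the relaxed solution is fractional.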


Adaptive Hydraulics (AdH) is a modern, multi-dimensional modeling system for saturated and unsaturated groundwater, overland flow, three-dimensional Navier-Stokes flow, and two- or three-dimensional shallow water problems. Developed by the Coastal and Hydraulics Laboratory at the Engineer Research and Development Center in Vicksburg, MS, the 2-dimensional (2D) shallow water module of AdH was released to the public in September 2007.


This site outlines the Aquatic Ecodynamics (AED) modelling library - an open-source community-driven library of model components for simulation of "aquatic ecodynamics" - water quality, habitat and aquatic ecosystem dynamics.

The AED library consists of numerous modules that are designed as individual model ‘components’ able to be configured in a way that facilitates custom aquatic ecosystem conceptualisations – either simple or complex. Users select the water quality and ecosystem variables they wish to simulate and are then able to customise connections and dependencies with other modules, including support for easy customisation, at an algorithm level, of how model components operate (e.g. photosynthesis functions, sorption algorithms, etc.). In general, model components consider the cycling of carbon, nitrogen and phosphorus, and other relevant components such as oxygen, and are able to simulate organisms including different functional groups of phytoplankton and zooplankton, and also organic matter. Modules to support simulation of water column and sediment geochemistry, including coupled kinetic-equilibria, are also included.


The Stochastic Simulation Algorithm (SSA) developed by Gillespie provides a powerful mechanism for exploring the behavior of chemical systems with small species populations or with important noise contributions. Gene circuit simulations for systems biology commonly employ the SSA method, as do ecological applications. This algorithm tends to be computationally expensive, so researchers seek an efficient implementation of SSA. In this program package, the Accelerated Exact Stochastic Simulation Algorithm (AESS) contains optimized implementations of Gillespie’s SSA that improve the performance of individual simulation runs or ensembles of simulations used for sweeping parameters or to provide statistically significant results.
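
For readers unfamiliar with the SSA, here is a minimal sketch of Gillespie's direct method in plain Python. It is a textbook-style, unoptimized version and not the AESS package's code; the function name, argument layout and the example rate constants are assumptions made for this illustration.

```python
import math
import random

def gillespie_direct(x0, stoich, propensities, t_end, rng):
    """Gillespie's direct method for a well-mixed reaction system.

    x0:           initial species counts
    stoich:       one state-change vector per reaction
    propensities: one function (state -> rate) per reaction
    """
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        rates = [a(x) for a in propensities]
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Pick reaction j with probability rates[j] / total.
        threshold, acc = rng.random() * total, 0.0
        for j, rate in enumerate(rates):
            acc += rate
            if threshold < acc:
                break
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory

if __name__ == "__main__":
    # Reversible isomerisation A <-> B with made-up rate constants.
    stoich = [(-1, +1), (+1, -1)]
    props = [lambda x: 0.5 * x[0], lambda x: 0.2 * x[1]]
    traj = gillespie_direct((20, 0), stoich, props, t_end=10.0, rng=random.Random(1))
    print(len(traj), traj[-1])
```

Every reaction event costs a full pass over the propensities here, which hints at why the method is expensive and why optimized implementations are worth having.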


Akaros is an open source, GPL-licensed operating system for manycore architectures. Our goal is to provide support for parallel and high-performance applications and to scale to a large number of cores.


Albany is an implicit, unstructured grid, finite element code for the solution and analysis of partial differential equations. Albany is the main demonstration application of the AgileComponents software development strategy at Sandia. It is a PDE code that strives to be built almost entirely from functionality contained within reusable libraries (such as Trilinos/STK/Dakota/PUMI). Albany plays a large role in demonstrating and maturing functionality of new libraries, and also in the interfaces and interoperability between these libraries. It also serves to expose gaps in our coverage of capabilities and interface design.

The highlight of Albany is the PDE assembly. The template-based generic programming approach allows developers to just program for residual equations, and all manner of derivatives and polynomial propagations get automatically computed with no development effort. This approach uses Phalanx for rapid and flexible addition of physics, which works closely with Sacado and Stokhos for automatic propagation of derivatives and UQ. The Trilinos Intrepid and Shards packages are used for the local discretization. A second strength of Albany is the demonstration of transformational analysis algorithms. Albany demonstrates the direct use of all Solver/Analysis tools in Trilinos (through Piro, which was developed in Albany) including NOX, LOCA, Rythmos, Stokhos, and all of Dakota. On any problem we not only get a solution, but can also get sensitivities, run optimization problems, and perform uncertainty quantification. All of these approaches can access all of the linear solver options in Trilinos that are exposed by the Stratimikos layer. The third main strength is the early adoption of STK, the Sierra Toolkit libraries. This includes the mesh database, IO, and mesh adaptation capabilities from stk_rebalance and stk_adapt.


Amahi is software that runs on a dedicated PC as a central computer for your home. It handles your entertainment, storage, and computing needs. You can store, organize and deliver your recorded TV shows, videos and music to media devices in your network. Share them locally or safely around the world. And it’s expandable with a multitude of one-click install apps.


AMD LibM is a software library containing a collection of basic math functions optimized for x86-64 processor based machines. It provides many routines from the list of standard C99 math functions. AMD LibM is a C library, which users can link in to their applications to replace compiler-provided math functions. Generally, programmers access basic math functions through their compiler. But those who want better accuracy or performance than their compiler’s math functions can use this library to help improve their applications. Users can also take advantage of the vector functions in this library. The vector variants can be used to speed up loops and perform math operations on multiple elements conveniently.


AMG2013 is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. It has been derived directly from the BoomerAMG solver in the hypre library, a large linear solver library that is being developed in the Center for Applied Scientific Computing (CASC) at LLNL. The driver provided in the benchmark can build various test problems. The default problem is a Laplace type problem on an unstructured domain with various jumps and an anisotropy in one part.

AMG2013 is written in ISO-C. It is an SPMD code which uses MPI as well as OpenMP. Parallelism is achieved by data decomposition. The driver provided with AMG2013 achieves this decomposition by simply subdividing the grid into logical P x Q x R (in 3D) chunks of equal size. The benchmark was designed to test parallel weak scaling efficiency.
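
The P x Q x R subdivision described above can be sketched as follows. This is an assumed, simplified version written for this note, not the AMG2013 driver: the function names and the rank ordering (first index varying fastest) are hypothetical, and remainders are spread over the leading chunks so that uneven sizes also work.

```python
def chunk_bounds(n, parts, index):
    """Half-open [start, stop) range of chunk `index` when n cells are split into `parts`."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    return start, start + base + (1 if index < extra else 0)

def local_box(rank, grid, procs):
    """Sub-box owned by `rank` for a grid (nx, ny, nz) on a P x Q x R process layout."""
    P, Q, R = procs
    # Decode the rank into process coordinates (p, q, r), p varying fastest.
    p, rest = rank % P, rank // P
    q, r = rest % Q, rest // Q
    return tuple(chunk_bounds(n, parts, i)
                 for n, parts, i in zip(grid, (P, Q, R), (p, q, r)))

if __name__ == "__main__":
    for rank in range(8):
        print(rank, local_box(rank, (8, 8, 8), (2, 2, 2)))
```

In a weak-scaling run the per-process box is held fixed and the global grid grows with the number of processes, so each rank keeps the same amount of local work.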

AMG2013 is a highly synchronous code. Its communication and computation patterns exhibit the surface-to-volume relationship common to many parallel scientific codes. Hence, parallel efficiency is largely determined by the size of the data "chunks" mentioned above, and the speed of communications and computations on the machine. AMG2013 is also memory-access bound, doing only about 1-2 computations per memory access, so memory-access speeds will also have a large impact on performance.
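
The surface-to-volume relationship can be made concrete with a small calculation. The sketch below uses the face area of a chunk as a rough proxy for the data exchanged with neighbours and its cell count as a proxy for local work; both proxies and the function name are assumptions for illustration, not measurements of AMG2013.

```python
def surface_to_volume(nx, ny, nz):
    """Ratio of face cells (communication proxy) to all cells (computation proxy)."""
    surface = 2 * (nx * ny + ny * nz + nx * nz)  # cells on the six faces
    volume = nx * ny * nz                        # cells updated locally
    return surface / volume

if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, surface_to_volume(n, n, n))
```

For a cube of side n the ratio is 6/n, so doubling the chunk edge halves the relative communication cost, which is why chunk size matters so much for parallel efficiency.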


AMGCL is a C++ header-only library for constructing an algebraic multigrid (AMG) hierarchy. AMG is one of the most effective methods for the solution of large sparse unstructured systems of equations, arising, for example, from the discretization of PDEs on unstructured grids [5,6]. The method can be used as a black-box solver for various computational problems, since it does not require any information about the underlying geometry. AMG is often used not as a standalone solver but as a preconditioner within an iterative solver (e.g. Conjugate Gradients, BiCGStab, or GMRES).

AMGCL builds the AMG hierarchy on a CPU and then transfers it to one of the provided backends. This allows for transparent acceleration of the solution phase with help of OpenCL, CUDA, or OpenMP technologies. Users may provide their own backends which enables tight integration between AMGCL and the user code.

Anaconda Accelerate

Accelerate is an add-on to Continuum’s free enterprise Python distribution, Anaconda. It opens up the full capabilities of your GPU or multi-core processor to Python. Accelerate includes two packages that can be added to your Python installation: NumbaPro and MKL Optimizations. MKL Optimizations makes linear algebra, random number generation, Fourier transforms, and many other operations run faster and in parallel. NumbaPro builds fast GPU and multi-core machine code from easy-to-read Python and NumPy code with a Python-to-GPU compiler.

If you are an academic at a degree-granting institution, all of these add-ons are free of charge. Simply click Anaconda Academic License and fill out the form. If your email address ends in .edu or is in our list of approved academic institutions, the license will be automatically sent to the provided email.

Accelerated Computing with Python

Python is one of the fastest growing and most popular programming languages available. However, as an interpreted language, it has been considered too slow for high-performance computing. That has now changed with the release of the NumbaPro Python compiler from Continuum Analytics.

CUDA Python – Using the NumbaPro Python compiler, which is part of the Anaconda Accelerate package from Continuum Analytics, you get the best of both worlds: rapid iterative development and all other benefits of Python combined with the speed of a compiled language targeting both CPUs and NVIDIA GPUs.


Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.

Ansible is a radically simple IT automation system. It handles configuration-management, application deployment, cloud provisioning, ad-hoc task-execution, and multinode orchestration - including trivializing things like zero downtime rolling updates with load balancers.


ANUGA is a Free & Open Source Software (FOSS) package capable of modelling the impact of hydrological disasters such as dam breaks, riverine flooding, storm-surge or tsunamis.

ANUGA is based on the Shallow Water Wave Equation discretised to unstructured triangular meshes using a finite-volumes numerical scheme. A major capability of ANUGA is that it can model the process of wetting and drying as water enters and leaves an area. This means that it is suitable for simulating water flow onto a beach or dry land and around structures such as buildings. ANUGA is also capable of modelling difficult flows involving shock waves and rapidly changing flow speed regimes (transitions from sub critical to super critical flows).
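The distinction between sub critical and super critical flow mentioned above is conventionally made with the Froude number, Fr = v / sqrt(g h), where v is the flow speed and h the water depth. The sketch below is a standalone illustration and does not use ANUGA's API; the function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def froude_number(speed, depth, g=G):
    """Froude number of a shallow flow with the given speed (m/s) and depth (m)."""
    if depth <= 0:
        raise ValueError("depth must be positive; a dry cell has no Froude number")
    return abs(speed) / math.sqrt(g * depth)


def flow_regime(speed, depth):
    """Classify the flow as subcritical (Fr < 1), critical (Fr = 1) or supercritical (Fr > 1)."""
    fr = froude_number(speed, depth)
    if fr < 1.0:
        return "subcritical"
    if fr > 1.0:
        return "supercritical"
    return "critical"


print(flow_regime(1.0, 2.0))  # slow, deep flow
print(flow_regime(5.0, 0.5))  # fast, shallow flow
```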


AnyDSL is a framework for the rapid development of domain-specific languages (DSLs). AnyDSL’s main ingredient is its intermediate representation, Thorin. In contrast to other intermediate representations, Thorin features certain abstractions that make it possible to maintain domain-specific types and control flow.

As creating a front-end for a language is a complex and time-consuming endeavor, we offer Impala, an imperative language built on well-known imperative constructs. A DSL developer can hijack Impala such that desired domain-specific types and constructs are available in Impala simply by declaring them. The DSL developer just reuses Impala’s infrastructure (lexer, parser, semantic analysis, and code generator) and does not need to develop a separate front-end. Even more important: the decision how to implement domain-specific details is postponed to the expert of the target machine.


AMD OpenCL™ Accelerated Parallel Processing (APP) technology is a set of advanced hardware and software technologies that enable AMD graphics processing cores (GPU), working in concert with the system’s x86 cores (CPU), to execute heterogeneously to accelerate many applications beyond just graphics. This enables better balanced platforms capable of running demanding computing tasks faster than ever, and sets software developers on the path to optimize for AMD Accelerated Processing Units (APUs). The AMD APP Software Development Kit (SDK) is a complete development platform created by AMD to allow you to quickly and easily develop applications accelerated by AMD APP technology. The SDK provides samples, documentation, and other materials to quickly get you started leveraging accelerated compute using OpenCL™, Bolt, or C++ AMP in your C/C++ application, or Aparapi for your Java application.


ArrayFire is a high performance software library for parallel computing with an easy-to-use API. Its array based function set makes parallel programming simple. ArrayFire’s multiple backends (CUDA, OpenCL and native CPU) make it platform independent and highly portable. A few lines of code in ArrayFire can replace dozens of lines of parallel computing code, saving you valuable time and lowering development costs.


AsciiDoc is a text document format for writing notes, documentation, articles, books, ebooks, slideshows, web pages, man pages and blogs. AsciiDoc files can be translated to many formats including HTML, PDF, EPUB, man page. AsciiDoc is highly configurable: both the AsciiDoc source file syntax and the backend output markups (which can be almost any type of SGML/XML markup) can be customized and extended by the user.

See also Magic-Book-Project.


AsciidocToGo is a full-featured portable version of AsciiDoc that contains the complete toolchain to build HTML or DocBook/LaTeX-based PDF documentation out of plain ASCII text files. Just download AsciidocToGo and start writing instead of searching for days or maybe weeks to put together all of the required software parts.


Asciidoctor is a fast text processor and publishing toolchain for converting AsciiDoc content to HTML5, DocBook 5 (or 4.5) and other formats. The Asciidoctor project is an effort to bring a comprehensive and accessible publishing toolchain, centered around the AsciiDoc syntax, to a growing range of ecosystems, including Ruby, JavaScript and the JVM.

In addition to the standard AsciiDoc syntax, Asciidoctor recognizes additional markup and formatting options, such as font-based icons (e.g., icon:fire[]) and UI elements (e.g., button:[Save]). Asciidoctor also offers a modern, responsive theme based on Foundation to style the HTML5 output.

In addition to an AsciiDoc processor and a collection of stylesheets, the project provides plugins for Maven, Gradle and Guard and packages for operating systems such as Fedora, Debian and Ubuntu. It also pushes AsciiDoc to evolve by introducing new ideas and innovation and helps promote AsciiDoc through education and advocacy.


Asciidoctor EPUB3 is a set of Asciidoctor extensions for converting AsciiDoc to EPUB3 & KF8/MOBI.


A Gradle plugin that uses Asciidoctor via JRuby to process AsciiDoc source files within the project.


Asciidoctor LaTeX is a set of Asciidoctor extensions for converting AsciiDoc to LaTeX.


A native PDF renderer for AsciiDoc based on Asciidoctor and Prawn.


JavaScript port of Asciidoctor produced by Opal, a Ruby to JavaScript cross compiler.


DocGist is a URL proxy tool that converts AsciiDoc documents fetched from Gists, GitHub repositories, Dropbox folders and other sources to HTML. The conversion to HTML is performed in the browser (client-side) using the Asciidoctor.js JavaScript library. DocGist can render documents located anywhere, as long as the host permits cross-domain access.


MPLW is a Matplotlib (MPL) wrapper that can work as an AsciiDoc filter. Using this filter you can generate plots from inline matplotlib scripts.


A complete editor for structured text documents with proofreading features. RTextDoc is designed for typesetting professional research papers using LaTeX that are heavy on mathematics and images. In addition, it is designed for writing notes, books, ebooks, slideshows, web pages, man pages and blogs using AsciiDoc mark-up language. RTextDoc also supports DocBook.


asciinema is a free and open source solution for recording terminal sessions and sharing them on the web. When you run asciinema rec in your terminal the recording starts, capturing all output that is printed to your terminal while you issue shell commands. When the recording finishes (by hitting Ctrl-D or typing exit), the captured output is uploaded to the website and prepared for playback on the web.


All three methods have been implemented in the new MAPLE package ASP (Automated Symmetry Package) which is an add-on to the MAPLE symmetry package DESOLVII (Vu, Jefferson and Carminati (2012) [25]). To our knowledge, this is the first computer package to automate all three methods of determining approximate symmetries for differential systems. Extensions to the theory have also been suggested for the third method and which generalise the first method to systems of differential equations. Finally, a number of approximate symmetries and corresponding solutions are compared with results in the literature.


Assimulo is a simulation package for solving ordinary differential equations. It is written in the high-level programming language Python and combines a variety of different solvers written in FORTRAN, C and even Python via a common high-level interface. The primary aim of Assimulo is not to develop new integration algorithms. The aim is to provide a high-level interface for a wide variety of solvers, both new and old, solvers of industrial standards as well as experimental solvers. The aim is to allow comparison of solvers for a given problem without the need to define the problem in a number of different programming languages to accommodate the different solvers.
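The idea of defining a problem once and running it through several solvers behind a common interface can be illustrated with a small pure-Python sketch. This is not Assimulo's API: the solve function, the SOLVERS table, and the two hand-written steppers (explicit Euler and classical fourth-order Runge-Kutta) are assumptions made for the example.

```python
import math


def euler_step(f, t, y, h):
    """One explicit Euler step for y' = f(t, y)."""
    return y + h * f(t, y)


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


# Hypothetical registry: every solver is reached through the same call signature.
SOLVERS = {"euler": euler_step, "rk4": rk4_step}


def solve(solver, f, y0, t0, t_end, steps):
    """Integrate y' = f(t, y) from t0 to t_end with the named solver."""
    step = SOLVERS[solver]
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = step(f, t, y, h)
        t += h
    return y


def decay(t, y):
    """Test problem y' = -y, whose exact solution is exp(-t)."""
    return -y


# The problem is defined once and each solver is compared against the exact value.
errors = {
    name: abs(solve(name, decay, 1.0, 0.0, 1.0, 10) - math.exp(-1.0))
    for name in SOLVERS
}
print(errors)
```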


A text editor for the 21st century.


Authorea is the collaborative platform for research. Write and manage your technical documents in one place.

Azimuth Project

The Azimuth Project is an international collaboration to create a focal point for scientists and engineers interested in saving the planet. Our goal is to make clearly presented, accurate information on the relevant issues easy to find, and to help people work together on our common problems.


Bulk Data Mover (BDM) is a scalable data transfer management tool for the GridFTP transfer protocol. The goal is to reliably manage transfers of 1 PB or more spanning millions of files.


BDMPI is a message passing library and associated runtime system for developing out-of-core distributed computing applications for problems whose aggregate memory requirements exceed the amount of memory that is available on the underlying computing cluster. BDMPI is based on the Message Passing Interface (MPI) and provides a subset of MPI’s API along with some extensions that are designed for BDMPI’s memory and execution model.

A BDMPI-based application is a standard memory-scalable parallel MPI program that was developed assuming that the underlying system has enough computational nodes to allow for the in-memory execution of the computations. This program is then executed using a sufficiently large number of processes so that the per-process memory fits within the physical memory available on the underlying computational node(s). BDMPI maps one or more of these processes to the computational nodes by relying on the OS’s virtual memory management to accommodate the aggregate amount of memory required by them. BDMPI prevents memory thrashing by coordinating the execution of these processes using node-level co-operative multi-tasking that limits the number of processes that can be running at any given time. This ensures that the currently running process(es) can establish and retain memory residency and thus achieve efficient execution. BDMPI exploits the natural blocking points that exist in MPI programs to transparently schedule the co-operative execution of the different processes. In addition, BDMPI’s implementation of MPI’s communication operations is designed to maximize the time over which a process can execute between successive blocking points. This allows it to amortize the cost of loading data from disk over the maximal amount of computations that can be performed.

Since BDMPI is based on the standard MPI library, it also provides a framework that allows the automated out-of-core execution of existing MPI applications. BDMPI is implemented as a drop-in replacement for existing MPI implementations, so existing codes that use only the subset of MPI functions implemented by BDMPI compile unchanged.
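The core scheduling idea, many workers of which only a bounded number may run at once, can be sketched with a counting semaphore. This is a loose analogy rather than BDMPI's implementation: BDMPI schedules OS processes at MPI blocking points, whereas the hypothetical run_cooperatively function below uses Python threads and simply records the peak number of concurrently running workers.

```python
import threading


def run_cooperatively(num_workers, max_running, work):
    """Run work(rank) for every rank while allowing at most max_running at a time."""
    slots = threading.Semaphore(max_running)  # one slot per worker allowed to run
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}
    results = [None] * num_workers

    def worker(rank):
        with slots:  # block until a run slot is free
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            try:
                results[rank] = work(rank)
            finally:
                with lock:
                    state["running"] -= 1

    threads = [threading.Thread(target=worker, args=(rank,)) for rank in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, state["peak"]


results, peak = run_cooperatively(8, 2, lambda rank: rank * rank)
print(results, "peak concurrency:", peak)
```

Capping the number of running workers is what lets each of them keep its working set resident instead of competing with all the others for memory.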


DataMover-Lite (DML) is a simple file transfer tool with a graphical user interface that supports multi-protocol data movement. DML is available in both webstart and standalone versions. Currently, DML supports http, https, ftp, gridftp, lahfs and scp. For GridFTP, DML also supports directory browsing and transferring.


Beaker is a notebook-style development environment for working interactively with large and complex datasets. Its plugin-based architecture allows you to switch between languages or add new ones with ease, ensuring that you always have the right tool for any of your analysis and visualization needs.


A high-performance parallel file system from the Fraunhofer Center for High Performance Computing. BeeGFS is a pure software solution for scale-out parallel network-accessible storage, developed with a strong focus on performance and designed for very easy installation and management. If I/O intensive workloads are your problem, BeeGFS is the solution.

BeeGFS provides a common file system for shared access to multiple clients and transparently spreads user data across multiple servers. By increasing the number of servers and/or disks in the system, you can simply scale performance and capacity of the file system to the level that you need.


BeStMan is a full implementation of SRM v2.2, developed by Lawrence Berkeley National Laboratory, for disk-based storage systems and mass storage systems such as HPSS. End users may have their own personal BeStMan that manages and provides an SRM interface to their local disks or storage systems. It works on top of existing disk-based Unix file systems, and has so far been reported to work on file systems such as NFS, PVFS, AFS, GFS, GPFS, PNFS, and Lustre. It also works with any existing file transfer service, such as gsiftp, http, https and ftp. It requires minimal administrative effort for deployment and maintenance.

BID Data Project

The BID Data Suite is a collection of hardware, software and design patterns that enable fast, large-scale data mining at very low cost. The software consists of two parts:

  • BIDMat, an interactive matrix library that integrates CPU and GPU acceleration and novel computational kernels.

  • BIDMach, a machine learning system that includes very efficient model optimizers and mixing strategies.

BIDMach is an interactive environment designed to make it extremely easy to build and use machine learning models. BIDMach includes core classes that take care of managing data sources, optimization and distributing data over CPUs or GPUs. It’s very easy to write your own models by generalizing from the models already included in the Toolkit.


Virtual large arrays and lazy evaluation.


BigView allows for interactive panning and zooming of images of arbitrary size on desktop PCs running Linux. Additionally, it can work in a multi-screen environment where multiple PCs cooperate to view a single, large image. Using this software, one can explore — on relatively modest machines — images such as the Mars Orbiter Camera mosaic [92,160×33,280 pixels].

The images must be first converted into “paged” format, where the image is stored in 256×256 “pages” to allow rapid movement of pixels into texture memory. The format contains an “image pyramid”: a set of scaled versions of the original image. Each scaled image is 1/2 the size of the previous, starting with the original down to the smallest, which fits into a single 256×256 page.
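The pyramid layout described above is easy to work out numerically. The sketch below is an illustration only and does not reflect BigView's actual file format; the function names and the choice to round odd dimensions up when halving are assumptions.

```python
def pyramid_levels(width, height, page=256):
    """Image sizes from the original down to the first level that fits in one page."""
    levels = [(width, height)]
    while width > page or height > page:
        # Halve each dimension, rounding up so no pixels are lost (an assumption).
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        levels.append((width, height))
    return levels


def pages_needed(width, height, page=256):
    """Number of page x page tiles required to cover an image of the given size."""
    def ceil_div(a, b):
        return -(-a // b)

    return ceil_div(width, page) * ceil_div(height, page)


levels = pyramid_levels(92160, 33280)
for level, (w, h) in enumerate(levels):
    print(level, w, h, pages_needed(w, h))
```

For the 92,160×33,280 mosaic mentioned above this gives ten levels, from the full image (46,800 pages) down to a 180×65 image held in a single page.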


A repository for Conda binaries, amongst other things.

Rich Signell’s Binstar -


Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) is a package, written in C and MATLAB/OCTAVE, that includes an eigensolver implemented with the Locally Optimal Block Preconditioned Conjugate Gradient Method (LOBPCG). Its main features are: a matrix-free iterative method for computing several extreme eigenpairs of symmetric positive generalized eigenproblems; a user-defined symmetric positive preconditioner; robustness with respect to random initial approximations, variable preconditioners, and ill-conditioning of the stiffness matrix; and apparently optimal convergence speed.

BLOPEX supports parallel MPI-based computations. BLOPEX is incorporated in the HYPRE package and is available as an external block to the PETSc package. SLEPc and PHAML have interfaces to call BLOPEX eigensolvers.


A blocking, shuffling and loss-less compression library. Blosc is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc is the first compressor (that I’m aware of) that is meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate memory-bound computations (which is typical in vector-vector operations).
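The "shuffling" step can be illustrated with a byte shuffle: the bytes of fixed-size items are regrouped so that all first bytes come together, then all second bytes, and so on, which tends to expose long runs to the compressor. The sketch below is not Blosc's implementation or API. It uses Python's zlib as a stand-in compressor, and the shuffle and unshuffle helpers are hypothetical names.

```python
import struct
import zlib


def shuffle(data, itemsize):
    """Group byte 0 of every item, then byte 1 of every item, and so on."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    return b"".join(data[i::itemsize] for i in range(itemsize))


def unshuffle(data, itemsize):
    """Invert shuffle(), restoring the original item-by-item byte order."""
    count = len(data) // itemsize
    out = bytearray(len(data))
    for i in range(itemsize):
        out[i::itemsize] = data[i * count:(i + 1) * count]
    return bytes(out)


# 10,000 consecutive 32-bit little-endian integers: the high bytes barely change.
values = struct.pack("<10000I", *range(10000))
plain_size = len(zlib.compress(values))
shuffled_size = len(zlib.compress(shuffle(values, 4)))
print("unshuffled:", plain_size, "bytes; shuffled:", shuffled_size, "bytes")
```

On slowly varying numeric data like this, the shuffled stream compresses to a smaller size because each byte position forms its own highly regular sequence.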


Command line interface to and serialization format for Blosc, a high performance, multi-threaded, blocking and shuffling compressor. Uses python-blosc bindings to interface with Blosc. Also comes with native support for efficiently serializing and deserializing Numpy arrays.


Web sites are made of lots of things — frameworks, libraries, assets, utilities, and rainbows. Bower manages all these things for you.

Bower works by fetching and installing packages from all over, taking care of hunting, finding, downloading, and saving the stuff you’re looking for. Bower keeps track of these packages in a manifest file, bower.json. How you use packages is up to you. Bower provides hooks to facilitate using packages in your tools and workflows.

Bower is optimized for the front-end. Bower uses a flat dependency tree, requiring only one version for each package, reducing page load to a minimum.

A very useful thing is the search engine for packages that can be installed by Bower.


BRL-CAD is a powerful cross-platform open source solid modeling system that includes interactive geometry editing, high-performance ray-tracing for rendering and geometric analysis, image and signal-processing tools, a system performance analysis benchmark suite, libraries for robust geometric representation, with more than 20 years of active development.


Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).


Calaos is a free software project (GPLv3) that lets you control and monitor your home. You can easily install and use it to transform your home into a smart home.


It is already common for simulations to discard most of what they compute in order to minimize time spent on I/O. As we enter the exascale age the problem of scarce I/O capability continues to grow. Since storing data is no longer viable for many simulation applications, data analysis and visualization must now be performed in situ with the simulation to ensure that it is running smoothly and to fully understand the results that the simulation produces. Catalyst is a light-weight version of the ParaView server library that is designed to be directly embedded into parallel simulation codes to perform in situ analysis at run time.


A C and Fortran Interface to access Climate and NWP model Data. Supported data formats are GRIB, netCDF, SERVICE, EXTRA and IEG.


Software that enables our collaborators to easily harness large-scale distributed systems such as clusters, clouds, and grids. We perform fundamental computer science research that enables new discoveries through computing in fields such as physics, chemistry, bioinformatics, biometrics, and data mining. The tools are:

  • Parrot - Parrot is a tool for attaching existing programs to remote I/O systems through the filesystem interface. Parrot "speaks" a variety of remote I/O services, including HTTP, FTP, GridFTP, iRODS, HDFS, XRootD, GROW, and Chirp, on behalf of ordinary programs.

  • Chirp - A user-level file system for collaboration across distributed systems such as clusters, clouds, and grids. Chirp allows ordinary users to discover, share, and access storage, whether within a single machine room or over a wide area network.

  • Makeflow - A workflow engine for executing large complex workflows on clusters, clouds, and grids. Makeflow is very similar to traditional Make, so if you can write a Makefile, then you can write a Makeflow.

  • Work Queue - A framework for building large master-worker applications that span many computers including clusters, clouds, and grids. Work Queue applications are written in C, Perl, or Python using a simple API that allows users to define tasks, submit them to the queue, and wait for completion. Tasks are executed by a standard worker process that can run on any available machine. Each worker calls home to the master process, arranges for data transfer, and executes the tasks. The system handles a wide variety of failures, allowing for dynamically scalable and robust applications.

  • SAND - A set of modules for genome assembly that are built atop the Work Queue platform for large-scale distributed computation on clusters, clouds, or grids.
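The define, submit, and wait pattern that the Work Queue entry describes can be sketched with Python's standard library. This is not the Work Queue API: the run_tasks helper is hypothetical, and a local thread pool stands in for workers that would normally run on remote machines.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_tasks(tasks, num_workers=4):
    """Submit named tasks to a pool of workers and collect results as they complete.

    tasks maps a task name to a (function, arguments) pair.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # The "master" submits every task to the queue up front.
        futures = {
            pool.submit(func, *args): name
            for name, (func, args) in tasks.items()
        }
        # Then it waits for completions in whatever order the workers finish.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


print(run_tasks({"square": (pow, (7, 2)), "total": (sum, ([1, 2, 3],))}))
```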


A large tool set for working on climate and NWP model data. NetCDF 3/4, GRIB 1/2 including SZIP and JPEG compression, EXTRA, SERVICE and IEG are supported as IO-formats. Apart from that CDO can be used to analyse any kind of gridded data not related to climate science. CDO has very small memory requirements and can process files larger than the physical memory.

configure --enable-cdi-lib --with-fftw3 --with-jasper=/usr/lib64
--with-libxml2=yes --with-udunits2=/usr/lib64 --with-curl=/usr/lib64
--with-proj=/usr/lib64 --with-netcdf=yes --with-hdf5=yes --with-szlib=yes
--with-threads=yes --with-grib-api=yes


configure: CDO is configured with the following options:
   "CC"                 : "gcc -std=gnu99",
   "CPP"                : "gcc -E",
   "CPPFLAGS"           : "-I/usr/lib64/include -I/usr/lib64/include -I/usr/lib64/include -I/usr/lib64/include -I/usr/include/libxml2",
   "CFLAGS"             : "-g -O2 -fopenmp ",
   "LDFLAGS"            : "-L/usr/lib64/lib -L/usr/lib64/lib  -L/usr/lib64/lib -L/usr/lib64/lib",
   "LIBS"               : "-lxml2 -ludunits2 -lcurl -lproj -lfftw3 -lgrib_api -ljasper -lnetcdf -lhdf5_hl -lhdf5 -lsz -lz  -lm ",
   "FCFLAGS"            : "",
   "INCLUDES"           : "@INCLUDES@",
   "LD"                 : "/usr/bin/ld -m elf_x86_64",
   "NM"                 : "/usr/bin/nm -B",
   "AR"                 : "ar",
   "AS"                 : "as",
   "DLLTOOL"            : "false",
   "OBJDUMP"            : "objdump",
   "STRIP"              : "strip",
   "RANLIB"             : "ranlib",
   "INSTALL"            : "/usr/bin/install -c",
   "cdi"                : {
     "enable_cdi_lib" : true
  "threads"    : {
    "lib"      : "",
    "include"  : ""
  "zlib"       : {
    "lib"      : " -lz",
  "szlib"      : {
    "lib"      : " -lsz",
    "include"  : ""
  "hdf5"       : {
    "lib"      : " -lhdf5",
    "include"  : ""
  "netcdf"     : {
    "lib"      : " -lnetcdf",
    "include"  : ""
  "udunits2"   : {
    "lib"      : " -L/usr/lib64/lib -ludunits2",
    "include"  : " -I/usr/lib64/include"
  "proj"       : {
    "lib"      : " -L/usr/lib64/lib -lproj",
    "include"  : " -I/usr/lib64/include"
  "USER_NAME"          : "baum",
  "HOST_NAME"          : "max",
  "SYSTEM_TYPE"        : "x86_64-unknown-linux-gnu"


Cdo{rb,py} allows you to use CDO in the context of Python and Ruby as if it were a native library.

CDSC Mapper

The CDSC Mapper is a compiler package for heterogeneous mapping on various targets such as multi-core CPUs, GPUs and FPGAs. The objective is to provide the user with a complete compilation platform to ease the programming of complex heterogeneous devices, such as a Convey HC1-ex machine. The architecture of the compiler is based on a collection of production-quality compilers such as GNU GCC, Nvidia GCC and LLVM; two open-source compilation infrastructures on top of which development has been performed: the LLNL ROSE compiler and the LLVM project; and a collection of research compilers and runtime such as CnC-HC, PolyOpt and SDSLc.


The CEOP Satellite Data Server is actually a gateway with an OPeNDAP front end and the ability to access data via the OGC WCS protocol on the back end. Though originally developed for the Coordinated Enhanced Observing Period (CEOP) effort, it can be used with other WCS servers. It is implemented as a plug-in handler to the Hyrax server distributed by OPeNDAP.


Cetus is a compiler infrastructure for the source-to-source transformation of software programs. It currently supports ANSI C. Since its creation in 2004, it has grown to over 80,000 lines of Java code, has been made available publicly on the web, and has become a basis for several research projects.

CFD Utilities

The CFD Utility Software Library (previously known as the Aerodynamics Division Software Library at NASA Ames Research Center) contains nearly 30 libraries of generalized subroutines and close to 100 applications built upon those libraries. These utilities have accumulated during four decades or so of software development in the aerospace field.

All are written in Fortran 90 or FORTRAN 77 with potential reuse in mind. The only exception is the C translations of a dozen or so numerics routines grouped as C_utilities.

David Saunders and Robert Kennelly are the primary authors, but miscellaneous contributions by others are gratefully acknowledged.

See 1-line summaries of the libraries and applications under the Files menu. Each library folder also contains 1-line summaries of the grouped subroutines, while each application folder contains READMEs adapted from the main program headers. NASA permission to upload actual software was granted on Jan. 24, 2014.


An I/O library for climate models, named CFIO (Climate Fast I/O).

CFIO provides the same interface and features as PnetCDF, and adopts an I/O forwarding technique to automatically overlap I/O with computation. CFIO performs better than PnetCDF in terms of decreasing the overall running time of the program.


CF-compliant NetCDF for radial data.


The CGAL Bindings project makes it possible to use some packages of CGAL, the Computational Geometry Algorithms Library, in languages other than C++, such as Java and Python. The bindings are implemented with SWIG.


An emerging parallel programming language whose design and development are being led by Cray Inc. in collaboration with academia, computing centers, and industry. Chapel’s goal is to make parallel programming more productive, from high-end supercomputers to commodity clusters and multicore desktops and laptops. Chapel is being developed in an open-source manner at SourceForge and is released under the BSD license.

Chapel supports a multithreaded execution model via high-level abstractions for data parallelism, task parallelism, concurrency, and nested parallelism. Chapel’s locale type enables users to specify and reason about the placement of data and tasks on a target architecture in order to tune for locality. Chapel supports global-view data aggregates with user-defined implementations, permitting operations on distributed data structures to be expressed in a natural manner. In contrast to many previous higher-level parallel languages, Chapel is designed around a multiresolution philosophy, permitting users to initially write very abstract code and then incrementally add more detail until they are as close to the machine as their needs require. Chapel supports code reuse and rapid prototyping via object-oriented design, type inference, and features for generic programming.

Chapel was designed from first principles rather than by extending an existing language. It is an imperative block-structured language, designed to be easy to learn for users of C, C++, Fortran, Java, Python, Matlab, and other popular languages. While Chapel builds on concepts and syntax from many previous languages, its parallel features are most directly influenced by ZPL, High-Performance Fortran (HPF), and the Cray MTA™/Cray XMT™ extensions to C and Fortran.


A high-performance language interoperability tool that generates Babel-compatible bindings for the Chapel programming language. For details on using the command-line tool, please consult the BRAID man page and the Babel user’s guide.


Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions. There are other reasons why a circular layout is advantageous, not the least being the fact that it is attractive.

Circos is ideal for creating publication-quality infographics and illustrations with a high data-to-ink ratio, richly layered data and pleasant symmetries. You have fine control over each element in the figure to tailor its focus points and detail to your audience.


CKAN is a powerful data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available.


This extension contains plugins that add geospatial capabilities to CKAN.


CL21 is an experimental project redesigning Common Lisp.


ClimatePipes uses a web-based application platform due to its widespread support on mainstream operating systems, ease-of-use, and inherent collaboration support. The front-end of ClimatePipes uses HTML5 (WebGL, CSS3) to deliver state-of-the-art visualization and to provide a best-in-class user experience. The back-end of the ClimatePipes is built using the Visualization Toolkit (VTK), Climate Data Analysis Tools (CDAT), and other climate and geospatial data processing tools such as GDAL and PROJ4.


This repository houses the code for the OpenCL™ BLAS portion of clMath. The complete set of BLAS level 1, 2 & 3 routines is implemented. Please see Netlib BLAS for the list of supported routines. In addition to GPU devices, the library also supports running on CPU devices to facilitate debugging and multicore programming. APPML 1.10 is the most current generally available pre-packaged binary version of the library available for download for both Linux and Windows platforms.

The primary goal of clBLAS is to make it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing. clBLAS interfaces neither hide nor wrap OpenCL interfaces, but rather leave OpenCL state management under the control of the user to allow for maximum performance and flexibility. The clBLAS library does generate and enqueue optimized OpenCL kernels, relieving the user of the task of writing, optimizing and maintaining kernel code themselves.


CLFORTRAN is an open source (LGPL) Fortran module, designed to provide direct access to GPU, CPU and accelerator based computing resources available by the OpenCL standard.


clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU optimized BLAS and LAPACK for AMD hardware, can be found in the AMD Accelerated Parallel Processing Math Libraries (APPML).


clMath is the open-source project for OpenCL based BLAS and FFT libraries. The complete set of BLAS level 1, 2 & 3 routines is implemented.


Clojure is a dynamic programming language that targets the Java Virtual Machine (and the CLR, and JavaScript). It is designed to be a general-purpose language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming. Clojure is a compiled language - it compiles directly to JVM bytecode, yet remains completely dynamic. Every feature supported by Clojure is supported at runtime. Clojure provides easy access to the Java frameworks, with optional type hints and type inference, to ensure that calls to Java can avoid reflection.

Clojure is a dialect of Lisp, and shares with Lisp the code-as-data philosophy and a powerful macro system. Clojure is predominantly a functional programming language, and features a rich set of immutable, persistent data structures. When mutable state is needed, Clojure offers a software transactional memory system and reactive Agent system that ensure clean, correct, multithreaded designs.


ClojureScript is a new compiler for Clojure that targets JavaScript. It is designed to emit JavaScript code which is compatible with the advanced compilation mode of the Google Closure optimizing compiler.


Leiningen is the easiest way to use Clojure. With a focus on project automation and declarative configuration, it gets out of your way and lets you focus on your code.


CLyther is a Python tool similar to Cython and PyPy. CLyther is a just-in-time specialization engine for OpenCL. The main entry points for CLyther are its clyther.task and clyther.kernel decorators. Once a function is decorated with one of these the function will be compiled to OpenCL when called.

CLyther is a Python language extension that makes writing OpenCL code as easy as Python itself. CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL.

CLyther exposes both the OpenCL C library as well as the OpenCL language to python.


CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. CLUTO is well-suited for clustering data sets arising in many diverse application areas including information retrieval, customer purchasing transactions, web, GIS, science, and biology.

CLUTO’s distribution consists of both stand-alone programs and a library via which an application program can access directly the various clustering and analysis algorithms implemented in CLUTO.


gCLUTO is a cross-platform graphical application for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. gCLUTO is built on top of the CLUTO clustering library.


wCLUTO is a web-enabled data clustering application that is designed for the clustering and data-analysis requirements of gene-expression analysis. wCLUTO is also built on top of the CLUTO clustering library. Users can upload their datasets, select from a number of clustering methods, perform the analysis on the server, and visualize the final results.


The "Climate Model Output Rewriter" (CMOR, pronounced "Seymour") comprises a set of C-based functions, with bindings to both Python and FORTRAN 90, that can be used to produce CF-compliant netCDF files that fulfill the requirements of many of the climate community’s standard model experiments. These experiments are collectively referred to as MIPs and include, for example, AMIP, CMIP, CFMIP, PMIP, APE, and IPCC scenario runs. The output resulting from CMOR is "self-describing" and facilitates analysis of results across models.

Much of the metadata written to the output files is defined in MIP-specific tables, typically made available from each MIP’s web site. CMOR relies on these tables to provide much of the metadata that is needed in the MIP context, thereby reducing the programming effort required of the individual MIP contributors.


The software package to be disclosed is a front end to an existing free software package, CMOR2 (Climate Model Output Rewriter), written by Lawrence Livermore National Laboratory (LLNL). It reads in a multitude of standard data formats, such as netcdf3, netcdf4, Grads control files, Matlab data files or a list of netcdf files, and converts the data into the CMIP5 data format to allow publication on the Earth System Grid Federation (ESGF) data node.


COCO (COmparing Continuous Optimisers) is a platform for systematic and sound comparisons of real-parameter global optimisers. COCO provides benchmark function testbeds and tools for processing and visualizing data generated by one or several optimizers. The COCO platform has been used for the Black-Box-Optimization-Benchmarking (BBOB) workshops that took place during the GECCO conference in 2009, 2010, 2012, and 2013.


Code::Blocks is a free C, C++ and Fortran IDE built to meet the most demanding needs of its users. It is designed to be very extensible and fully configurable. An IDE with all the features you need, having a consistent look, feel and operation across platforms. Built around a plugin framework, Code::Blocks can be extended with plugins. Any kind of functionality can be added by installing/coding a plugin. For instance, compiling and debugging functionality is already provided by plugins.


Code_Saturne solves the Navier-Stokes equations for 2D, 2D-axisymmetric and 3D flows, steady or unsteady, laminar or turbulent, incompressible or weakly dilatable, isothermal or not, with scalar transport if required.

Several turbulence models are available, from Reynolds-Averaged models to Large-Eddy Simulation models. In addition, a number of specific physical models are also available as "modules": gas, coal and heavy-fuel oil combustion, semi-transparent radiative transfer, particle-tracking with Lagrangian modeling, Joule effect, electric arcs, weakly compressible flows, atmospheric flows, rotor/stator interaction for hydraulic machines.


The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be huge, executing efficient kernels is fundamental. Their optimization is, however, a challenging issue. Even though affine loop nests are generally present, the short trip counts and the complexity of mathematical expressions make it hard to determine a single or unique sequence of successful transformations. Therefore, we present the design and systematic evaluation of COFFEE, a domain-specific compiler for local assembly kernels. COFFEE manipulates abstract syntax trees generated from a high-level domain-specific language for PDEs by introducing domain-aware composable optimizations aimed at improving instruction-level parallelism, especially SIMD vectorization, and register locality. It then generates C code including vector intrinsics.
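To make "local assembly" concrete, here is a minimal sketch in plain Python, assuming a 1D mesh with linear (P1) elements. It is not COFFEE's input language or its generated code; the function names are hypothetical and chosen only for illustration. The per-element kernel returns the exact 2x2 element mass matrix, and the outer loop scatters it into the global matrix.

```python
def local_mass_matrix(h):
    """Local assembly kernel: exact 2x2 mass matrix of one P1 element of length h."""
    return [[h / 3.0, h / 6.0],
            [h / 6.0, h / 3.0]]


def assemble_mass_matrix(nodes):
    """Assemble the global mass matrix for a 1D mesh given sorted node coordinates."""
    n = len(nodes)
    M = [[0.0] * n for _ in range(n)]
    for e in range(n - 1):                     # loop over elements
        h = nodes[e + 1] - nodes[e]
        local = local_mass_matrix(h)           # problem-specific kernel, run once per element
        dofs = (e, e + 1)                      # global indices of the element's two nodes
        for a in range(2):                     # short trip counts, as the abstract notes
            for b in range(2):
                M[dofs[a]][dofs[b]] += local[a][b]
    return M


nodes = [0.0, 0.25, 0.5, 1.0]
M = assemble_mass_matrix(nodes)
total = sum(sum(row) for row in M)             # sums to the domain length for P1 elements
```

The inner loops over `a` and `b` are the kind of tiny, fixed-size loop nest that a domain-specific compiler would target for vectorization.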


A Pythonic package for combinatorics. Combi lets you explore spaces of permutations and combinations as if they were Python sequences, but without generating all the permutations/combinations in advance. It lets you specify a lot of special conditions on these spaces. It also provides a few more classes that might be useful in combinatorics programming.
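The idea of treating a permutation space as a sequence without generating it can be sketched in pure Python. This is not Combi's API: `PermSpace` below is a hypothetical class written for illustration, and it computes the permutation at a given index directly using the factorial number system.

```python
from math import factorial


class PermSpace:
    """Hypothetical lazy sequence of all permutations of `items`, in lexicographic order."""

    def __init__(self, items):
        self.items = list(items)
        self.length = factorial(len(self.items))

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("permutation index out of range")
        pool = list(self.items)
        result = []
        # Peel off one factorial-base digit per position; each digit selects
        # which of the remaining items comes next.
        for remaining in range(len(pool), 0, -1):
            block = factorial(remaining - 1)
            position, index = divmod(index, block)
            result.append(pool.pop(position))
        return tuple(result)


small = PermSpace("abc")
big = PermSpace(range(20))        # 20! entries, none of them stored
last = big[len(big) - 1]          # computed on demand
```

Indexing costs time proportional to the square of the number of items (because of `pop`), independent of how many permutations the space contains.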


COMCOT (Cornell Multi-grid Coupled Tsunami Model) is a tsunami modeling package, capable of simulating the entire lifespan of a tsunami, from its generation and propagation to runup/rundown in coastal regions.

Waves can be generated via an incident wave maker, a fault model, a landslide, or even a customized profile. A flexible nested grid setup allows for a balance between accuracy and efficiency.


ConicBundle is a callable library for C/C++ that implements a bundle method for minimizing the sum of convex functions that are given by first order oracles or arise from Lagrangean relaxation of particular conic linear programs.

Context Free Art

Context Free is a program that generates images from written instructions called a grammar. The program follows the instructions in a few seconds to create images that can contain millions of shapes. Chris Coyne created a small language for design grammars called CFDG. These grammars are sets of non-deterministic rules to produce images. The images are surprisingly beautiful, often from very simple grammars. Context Free is a full graphical environment for editing, rendering, and exploring CFDG design grammars.
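The notion of a non-deterministic design grammar can be sketched in Python. This is not CFDG syntax or Context Free's implementation: the rule table, the `expand` function and the size cutoff are assumptions made for illustration. Each rule has weighted alternatives, one of which is picked at random on every expansion, and shapes below a minimum size are dropped so the recursion terminates.

```python
import random

# Hypothetical toy grammar: a rule name maps to weighted alternatives.
# Each alternative is a list of (symbol, dx, scale) calls; "CIRCLE" is the only terminal.
GRAMMAR = {
    "SPINE": [
        (0.9, [("CIRCLE", 0.0, 1.0), ("SPINE", 1.0, 0.9)]),  # draw, then continue smaller
        (0.1, [("CIRCLE", 0.0, 1.0)]),                       # draw and stop
    ],
}


def expand(symbol, x, size, rng, shapes, min_size=0.01):
    """Recursively expand `symbol`, appending (x, size) circles to `shapes`."""
    if size < min_size:          # too small to matter: stop recursing
        return
    if symbol == "CIRCLE":
        shapes.append((x, size))
        return
    alternatives = GRAMMAR[symbol]
    weights = [weight for weight, _ in alternatives]
    _, body = rng.choices(alternatives, weights=weights, k=1)[0]
    for child, dx, scale in body:
        expand(child, x + dx * size, size * scale, rng, shapes, min_size)


def render(seed):
    """Return the list of circles produced by one random derivation."""
    shapes = []
    expand("SPINE", 0.0, 1.0, random.Random(seed), shapes)
    return shapes


shapes = render(seed=7)
```

Different seeds give different derivations from the same grammar, which is the sense in which such rules are non-deterministic.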

See also Structure Synth.


Contiki is an open source operating system for the Internet of Things. Contiki connects tiny low-cost, low-power microcontrollers to the Internet. Contiki provides powerful low-power Internet communication. Contiki supports fully standard IPv6 and IPv4, along with the recent low-power wireless standards: 6lowpan, RPL, CoAP. With Contiki’s ContikiMAC and sleepy routers, even wireless routers can be battery-operated.


A Fortran 90 library that provides functions to manage grids and arbitrary sets of points, including interpolation and mapping between different coordinate systems.


The CEDA OGC Web Services framework (COWS) is a Python software framework developed at the Centre for Environmental Data Archival for implementing Open Geospatial Consortium web service standards.


The toolbox contains MATLAB® routines for computing recurrence plots and related problems.


Cubica is a toolkit for efficient finite element simulations of deformable bodies containing both geometric and material non-linearities. Its main feature is its use of subspace methods, also known as dimensional model reduction or reduced order methods, which can accelerate simulations by several orders of magnitude.


NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks. It emphasizes performance, ease-of-use, and low memory overhead. NVIDIA cuDNN is designed to be integrated into higher-level machine learning frameworks, such as UC Berkeley’s popular Caffe software. The simple, drop-in design allows developers to focus on designing and implementing neural net models rather than tuning for performance, while still achieving the high performance modern parallel computing hardware affords.


Cupid is a development and training environment for models that use the Earth System Modeling Framework (ESMF) and National Unified Operational Capability (NUOPC) Layer infrastructure. Cupid is implemented as a plug-in for the widely used Eclipse Integrated Development Environment (IDE). Together, Cupid and Eclipse form an accessible, appealing training environment that makes it easier and faster to build NUOPC-based applications.


A better microcontroller IDE.


D4M is a breakthrough in computer programming that combines the advantages of five distinct processing technologies (sparse linear algebra, associative arrays, fuzzy algebra, distributed arrays, and triple-store/NoSQL databases such as Hadoop HBase and Apache Accumulo) to provide a database and computation system that addresses the problems associated with Big Data. D4M significantly improves search, retrieval, and analysis for any business or service that relies on accessing and exploiting massive amounts of digital data. Evaluations have shown D4M to simultaneously increase computing performance and to decrease the effort required to build applications by as much as 100x. Improved performance translates into faster, more comprehensive services provided by companies involved in healthcare, Internet search, network security, and more. Less, and simplified, coding reduces development times and costs. Moreover, the D4M layered architecture provides a robust environment that is adaptable to various databases, data types, and platforms.


Damaris is a middleware for I/O and data management targeting large-scale, MPI-based HPC simulations. It initially proposed to dedicate cores for asynchronous I/O in multicore nodes of recent HPC platforms, with an emphasis on ease of integration in existing simulations, efficient resource usage (with the use of shared memory) and simplicity of extension through plugins.

Over the years, Damaris has evolved into a more elaborate system, providing the possibility to use dedicated cores or dedicated nodes for data processing and I/O. It proposes a seamless connection to the VisIt software to enable in situ visualization with minimum impact on run time. Damaris provides an extremely simple API and can be easily integrated in existing large-scale simulations.


The goal of the Damsel project is to enable Exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models.


Dart is a cohesive, scalable platform for building apps that run on the web (where you can use Polymer) or on servers (such as with Google Cloud Platform). Use the Dart language, libraries, and tools to write anything from simple scripts to full-featured apps.


DART is a community facility for ensemble DA developed and maintained by the Data Assimilation Research Section (DAReS) at the National Center for Atmospheric Research (NCAR). DART provides modelers, observational scientists, and geophysicists with powerful, flexible DA tools that are easy to implement and use and can be customized to support efficient operational DA applications. DART is a software environment that makes it easy to explore a variety of data assimilation methods and observations with different numerical models and is designed to facilitate the combination of assimilation algorithms, models, and real (as well as synthetic) observations to allow increased understanding of all three. DART includes extensive documentation, a comprehensive tutorial, and a variety of models and observation sets that can be used to introduce new users or graduate students to ensemble DA. DART also provides a framework for developing, testing, and distributing advances in ensemble DA to a broad community of users by removing the implementation-specific peculiarities of one-off DA systems.

DART employs a modular programming approach to apply an Ensemble Kalman Filter which nudges the underlying models toward a state that is more consistent with information from a set of observations. Models may be swapped in and out, as can different algorithms in the Ensemble Kalman Filter. The method requires running multiple instances of a model to generate an ensemble of states. A forward operator appropriate for the type of observation being assimilated is applied to each of the states to generate the model’s estimate of the observation.
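The update step described above can be sketched for a scalar state in plain Python. This is not DART code: it is a minimal perturbed-observation ensemble Kalman filter, assuming a single observation with known error variance, and the function and variable names are hypothetical.

```python
import random
from statistics import mean, variance


def enkf_update(ensemble, forward_operator, observation, obs_variance, rng):
    """Nudge each scalar ensemble member toward one observation."""
    # Apply the forward operator to every state to get the model's estimate of the observation.
    predicted = [forward_operator(x) for x in ensemble]
    x_mean, y_mean = mean(ensemble), mean(predicted)
    n = len(ensemble)
    cov_xy = sum((x - x_mean) * (y - y_mean)
                 for x, y in zip(ensemble, predicted)) / (n - 1)
    var_y = sum((y - y_mean) ** 2 for y in predicted) / (n - 1)
    gain = cov_xy / (var_y + obs_variance)      # scalar Kalman gain from ensemble statistics
    obs_std = obs_variance ** 0.5
    # Each member sees the observation plus its own random perturbation.
    return [x + gain * (observation + rng.gauss(0.0, obs_std) - y)
            for x, y in zip(ensemble, predicted)]


rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(200)]          # ensemble of model states
posterior = enkf_update(prior, lambda x: x, 3.0, 0.25, rng)
prior_var, posterior_var = variance(prior), variance(posterior)
```

Swapping the `forward_operator` argument is the scalar analogue of choosing an operator appropriate to the observation type.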


DASH is a C++ Template Library for Distributed Data Structures with Support for Hierarchical Locality for HPC and Data-Driven Science.

Exascale systems are scheduled to become available in 2018-2020 and will be characterized by extreme scale and a multilevel hierarchical organization. Efficient and productive programming of these systems will be a challenge, especially in the context of data-intensive applications. Adopting the promising notion of Partitioned Global Address Space (PGAS) programming, the DASH project develops a data-structure oriented C++ template library that provides hierarchical PGAS-like abstractions for important data containers (multidimensional arrays, lists, hash tables, etc.) and allows a developer to control (and explicitly take advantage of) the hierarchical data layout of global data structures. In contrast to other PGAS approaches such as UPC, DASH does not propose a new language or require compiler support to realize global address space semantics. Instead, operator overloading and other advanced C++ features are used to provide the semantics of data residing in a global and hierarchically partitioned address space based on a runtime system with one-sided messaging primitives provided by MPI or GASNet. As such, DASH can co-exist with parallel programming models already in widespread use (like MPI) and developers can take advantage of DASH by incrementally replacing existing data structures with the implementation provided by DASH. Efficient I/O directly to and from the hierarchical structures and DASH-optimized algorithms such as map-reduce are also part of the project. Two applications from molecular dynamics and geoscience are driving the project and are adapted to use DASH in the course of the project.


Dat is an open source project that provides a streaming interface between every file format and data storage backend.


DataHub is a unified, managed, collaborative platform for making data-processing easy. Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system.


DataMPI is an efficient, flexible, and productive communication library, which provides a set of key-value pair based communication interfaces that extends MPI for Big Data. By utilizing the efficient communication technologies of the High-Performance Computing area, DataMPI can speed up emerging data-intensive computing applications. DataMPI takes a step in bridging the two fields of HPC and Big Data.

DataMPI can support multiple modes for various Big Data Computing applications, including Common, MapReduce, Streaming, and Iteration. The current version implements the functionalities and features of the Common mode, which aims to support the single program, multiple data (SPMD) applications. The remaining modes will be released in the future.

The current implementation of DataMPI extends mpiJava. We also integrate some features from Hadoop under Apache License 2.0. The current evaluations of DataMPI use MVAPICH2 as the backend. DataMPI also supports other MPI implementations, such as MPICH2.


The DaviX project aims to provide a solution for optimized remote I/O, data management and management of large collections of files over the WebDav, Amazon S3 and HTTP protocols. Davix is multi-platform, open source and written in C++.

It is composed of two components:

  • libdavix: a C++ library. It offers an HTTP API, a remote I/O API and a POSIX compatibility layer.

  • davix-*: several utilities for file transfer, management of large collections of files, and management of large files.

DaviX supports features like session reuse, redirection caching, vector operations, Metalink, X509 client certificate, proxy certificate, SOCKS4/5 or VOMS.

Dax Toolkit

The Dax Toolkit supports the fine-grained concurrency for data analysis and visualization algorithms required to drive exascale computing. The basic computational unit of the Dax Toolkit is a worklet, a function that implements the algorithm’s behavior on an element of a mesh (that is, a point, edge, face, or cell) or a small local neighborhood. The worklet is constrained to be serial and stateless; it can access only the element passed to and from the invocation. With this constraint, the serial worklet function can be concurrently executed on an unlimited number of threads without the complications of memory clashes or other race conditions.

The Dax Toolkit provides dispatchers that apply worklets to all elements in an input mesh, the results of which are collected into a resulting mesh. Although worklets are not allowed communication, many visualization algorithms require operations such as variable array packing and coincident topology resolution that intrinsically require significant coordination among threads. Dax enables such algorithms by classifying and implementing the most common and versatile communicative operations, which, when used in conjunction with the appropriate worklets, complete the visualization algorithms.
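The worklet/dispatcher split can be sketched in Python. This is not the Dax Toolkit API (Dax is a C++ library); `magnitude_worklet` and `dispatch` are hypothetical names, and a thread pool stands in for the fine-grained parallel back end. The point is only that a function which reads nothing but its own element can be applied to all elements in any order.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def magnitude_worklet(vector):
    """Serial and stateless: touches only the element it is handed."""
    x, y, z = vector
    return sqrt(x * x + y * y + z * z)


def dispatch(worklet, elements, max_workers=4):
    """Apply `worklet` to every element and collect results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worklet, elements))


point_vectors = [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0), (1.0, 2.0, 2.0)]
magnitudes = dispatch(magnitude_worklet, point_vectors)
```

Because the worklet shares no state, no locks are needed; the only coordination is the dispatcher gathering the per-element results into the output array.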


DCCRG is an easy to use grid for FVM/FEM simulations written in C++. It handles load balancing and neighbour cell data updates between processes automatically. MPI is used for parallelization.

The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes.


DGGRID is a public domain software program for creating and manipulating Discrete Global Grids. A Discrete Global Grid (DGG) consists of a set of regions that form a partition of the Earth’s surface, where each region has a single point contained in the region associated with it. Each region/point combination is called a cell. Depending on the application, data objects or values may be associated with the regions, points, or cells of a DGG. A Discrete Global Grid System (DGGS) is a series of discrete global grids, usually consisting of increasingly finer resolution grids (though the term DGG is often used interchangeably with the term DGGS).


We introduce the Declaratron, a system which takes a declarative approach to specifying mathematically based scientific computation. This uses displayable mathematical notation (Content MathML) and is both executable and semantically well defined. We combine domain specific representations of physical science (e.g. CML, Chemical Markup Language), MathML formulae and computational specifications (DeXML) to create executable documents which include scientific data and mathematical formulae. These documents preserve the provenance of the data used, and build tight semantic links between components of mathematical formulae and domain objects---in effect grounding the mathematical semantics in the scientific domain.


In this paper it is suggested that a stochastic isotropic diffusive process, representing a spatial first order auto regressive process (AR(1)-process), can be used as a null hypothesis for the spatial structure of climate variability. By comparing the leading empirical orthogonal functions (EOFs) of a fitted null hypothesis with EOF modes of an observed data set, inferences about the nature of the observed modes can be made. The concept and procedure of fitting the null hypothesis to the observed EOFs is in analogy to time analysis, where an AR(1)-process is fitted to the statistics of the time series in order to evaluate the nature of the time scale behavior of the time series. The formulation of a stochastic null hypothesis allows one to define teleconnection patterns as those modes that are most distinguished from the stochastic null hypothesis. The method is applied to several artificial and real data sets including the sea surface temperature of the tropical Pacific and Indian Ocean and the Northern Hemisphere wintertime and tropical sea level pressure.

A Matlab script for computing the Distinct EOFs is available.
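The time-series analogy mentioned in the abstract, fitting an AR(1) process to the statistics of a series, can be sketched in a few lines of Python. This is not the Matlab script referred to above and does not compute EOFs; it only simulates an AR(1) series and recovers its coefficient from the lag-1 autocorrelation, with hypothetical function names.

```python
import random


def simulate_ar1(phi, n, rng, noise_std=1.0):
    """Generate x[t] = phi * x[t-1] + Gaussian white noise, starting from 0."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, noise_std))
    return x


def fit_ar1(series):
    """Estimate phi as the lag-1 autocorrelation of the series."""
    m = sum(series) / len(series)
    num = sum((a - m) * (b - m) for a, b in zip(series[:-1], series[1:]))
    den = sum((a - m) ** 2 for a in series)
    return num / den


rng = random.Random(42)
series = simulate_ar1(0.8, 5000, rng)
phi_hat = fit_ar1(series)     # should land close to the true value 0.8
```

A fitted coefficient like `phi_hat` defines the red-noise null hypothesis against which features of the observed series are then judged.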


DEPOT is a framework for easily storing and serving files in web applications on Python 2.6+ and Python 3.2+. Modern web applications need to rely on a huge amount of stored images, generated files and other data, which is usually best kept outside of your database. DEPOT provides a simple and effective interface for storing your files on a storage backend of your choice (Local, S3, GridFS) and easily relating them to your application models (SQLAlchemy, Ming) as you would for plain data.


Delite is a research project from Stanford University’s Pervasive Parallelism Laboratory (PPL). Delite is a compiler framework and runtime for parallel embedded domain-specific languages (DSLs). Our goal is to enable the rapid construction of high performance, highly productive DSLs.

Delite is still in alpha, and there is no official release. However, the develop (Delite) and delite-develop (LMS) branches should be relatively stable for experimental development of new DSLs. For those interested in developing their own DSLs, we highly recommend using Forge, which is itself a DSL that automates much of the process of creating DSLs embedded in Scala. For those interested in using instead of building DSLs, alpha builds of OptiML, a DSL for machine learning, OptiQL, a DSL for data querying, and OptiGraph, a DSL for graph analytics, are currently available.


OptiML is an embedded domain-specific language for machine learning. OptiML is developed as a research project from Stanford University’s Pervasive Parallelism Laboratory (PPL).

OptiML is currently targeted at machine learning researchers and algorithm developers; it aims to provide a productive, high performance, MATLAB-like environment for linear algebra supplemented with machine learning specific abstractions. Our primary goal is to allow machine learning practitioners to write code in a highly declarative manner and still achieve high performance on a variety of underlying parallel, heterogeneous devices. The same OptiML program should run well and scale on a CMP (chip multi-processor), a GPU, a combination of CMPs and GPUs, clusters of CMPs and GPUs, and eventually even FPGAs and other specialized accelerators.

In particular, OptiML is designed to allow statistical inference algorithms expressible by the Statistical Query Model to be both easy to express and very fast to execute. These algorithms can be expressed in a summation form, and can be parallelized using fine-grained map-reduce operations. OptiML employs aggressive optimizations to reduce unnecessary memory allocations and fuse operations together to make these as fast as possible. OptiML also attempts to specialize implementations to particular hardware devices as much as possible to achieve the best performance.
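The "summation form" can be illustrated in plain Python. This is not OptiML syntax (OptiML is embedded in Scala); it is a sketch, with hypothetical names, of a least-squares gradient written as a map over examples followed by a reduce, which is the shape that makes such algorithms easy to parallelize.

```python
from functools import reduce


def example_gradient(theta, example):
    """Gradient contribution of one (features, target) pair for squared error."""
    x, y = example
    error = sum(t * xi for t, xi in zip(theta, x)) - y
    return [error * xi for xi in x]


def add_vectors(a, b):
    return [ai + bi for ai, bi in zip(a, b)]


def batch_gradient(theta, data):
    partials = map(lambda ex: example_gradient(theta, ex), data)  # map: independent per example
    return reduce(add_vectors, partials)                          # reduce: associative sum


# Three points on the line y = 1 + 2x; the first feature is a constant 1.
data = [((1.0, 0.0), 1.0), ((1.0, 1.0), 3.0), ((1.0, 2.0), 5.0)]
theta = [0.0, 0.0]
for _ in range(2000):                       # plain gradient descent, step size 0.1
    grad = batch_gradient(theta, data)
    theta = [t - 0.1 * g for t, g in zip(theta, grad)]
```

Because the per-example terms do not depend on each other and addition is associative, the map and the reduce can be split across cores or devices without changing the result beyond floating-point rounding.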


A prototype meta DSL that generates Delite DSL implementations from a specification-like program.


DistAlgo is a very high-level language for programming distributed algorithms. This project implements a DistAlgo compiler with Python as the target language. In the following text, the name DistAlgo refers to the compiler and not the language.

Distributed Array Protocol

The Distributed Array Protocol (DAP) is a process-local protocol that allows two subscribers, called the “producer” and the “consumer” or the “exporter” and the “importer”, to communicate the essential data and metadata necessary to share a distributed-memory array between them. This allows two independently developed components to access, modify, and update a distributed array without copying. The protocol formalizes the metadata and buffers involved in the transfer, allowing several distributed array projects to collaborate, facilitating interoperability. By not copying the underlying array data, the protocol allows for efficient sharing of array data.


A tool for multidimensional variational analysis (divand) is presented. It allows the interpolation and analysis of observations on curvilinear orthogonal grids in an arbitrary high dimensional space by minimizing a cost function. This cost function penalizes the deviation from the observations, the deviation from a first guess and abruptly varying fields based on a given correlation length (potentially varying in space and time). Additional constraints can be added to this cost function such as an advection constraint which forces the analysed field to align with the ocean current. The method decouples naturally disconnected areas based on topography and topology.


D-LITe is a universal architecture for building simple applications over heterogeneous sensor networks.


A Matlab implementation of the Sparsity-Promoting Dynamic Mode Decomposition (DMDSP) algorithm. Dynamic Mode Decomposition (DMD) is an effective means for capturing the essential features of numerically or experimentally generated snapshots, and its sparsity-promoting variant DMDSP achieves a desirable tradeoff between the quality of approximation (in the least-squares sense) and the number of modes that are used to approximate available data. Sparsity is induced by augmenting the least-squares deviation between the matrix of snapshots and the linear combination of DMD modes with an additional term that penalizes the ℓ1-norm of the vector of DMD amplitudes. We employ alternating direction method of multipliers (ADMM) to solve the resulting convex optimization problem and to efficiently compute the globally optimal solution.
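In ADMM formulations of ℓ1-penalized problems, the step that handles the penalty is typically an elementwise soft-thresholding operation. The sketch below shows that operator in Python; it is not taken from the Matlab implementation, the names are hypothetical, and it covers only this one step, not the full DMDSP algorithm. It accepts complex values, since DMD amplitudes are generally complex.

```python
def soft_threshold(value, kappa):
    """Shrink `value` toward zero by `kappa` in magnitude; works for real or complex input."""
    magnitude = abs(value)
    if magnitude <= kappa:
        return 0.0 * value                 # small entries are set exactly to zero
    return (1.0 - kappa / magnitude) * value


def shrink_vector(values, kappa):
    """Apply soft thresholding to every entry of a vector of amplitudes."""
    return [soft_threshold(v, kappa) for v in values]


amplitudes = [3.0, -0.5, 3.0 + 4.0j, 0.2j]
shrunk = shrink_vector(amplitudes, 1.0)
nonzero = sum(1 for v in shrunk if v != 0)   # entries that survive the threshold
```

Larger values of `kappa` zero out more entries, which is how the penalty weight trades approximation quality against the number of retained modes.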


Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

Docker Machine

A tool that makes it really easy to go from “zero to Docker”. Machine creates Docker Engines on your computer, on cloud providers, and/or in your data center, and then configures the Docker client to securely talk to them.


An innovative feature of DOpElib is to provide a software toolkit to solve forward PDE problems as well as optimal control problems constrained by PDE. DOpElib concentrates on a unified approach for both linear and nonlinear problems by interpreting every PDE problem as nonlinear and applying a Newton method to solve it. The focus is on the numerical solution of both stationary and nonstationary problems which come from different application fields, like elasticity and plasticity, fluid dynamics, and multiphysics problems such as fluid-structure interactions.

Earth Orbit

An astronomically precise and accurate model that offers 3-D visualizations of Earth’s orbital geometry, Milankovitch parameters and the ensuing insolation forcing. The model is developed in MATLAB® as a user-friendly graphical user interface. Users are presented with a choice between the Berger (1978a) and Laskar et al. (2004) astronomical solutions for eccentricity, obliquity and precession. A "demo" mode is also available, which allows the Milankovitch parameters to be varied independently of each other, so that users can isolate the effects of each parameter on orbital geometry, the seasons, and insolation. A 3-D orbital configuration plot, as well as various surface and line plots of insolation and insolation anomalies on various time and space scales are produced. Insolation computations use the model’s own orbital geometry with no additional a priori input other than the Milankovitch parameter solutions.


EAVL is the Extreme-scale Analysis and Visualization Library.


An integrated development environment (IDE). It contains a base workspace and an extensible plug-in system for customizing the environment. Written mostly in Java, Eclipse can be used to develop applications. By means of various plug-ins, Eclipse may also be used to develop applications in other programming languages: Ada, ABAP, C, C++, COBOL, Fortran, Haskell, JavaScript, Lasso, Lua, Natural, Perl, PHP, Prolog, Python, R, Ruby (including Ruby on Rails framework), Scala, Clojure, Groovy, Scheme, and Erlang. It can also be used to develop packages for the software Mathematica. Development environments include the Eclipse Java development tools (JDT) for Java and Scala, Eclipse CDT for C/C++ and Eclipse PDT for PHP, among others.


Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.


ELCIRC is an unstructured-grid model designed for the effective simulation of 3D baroclinic circulation across river-to-ocean scales. It uses a finite-volume/finite-difference Eulerian-Lagrangian algorithm to solve the shallow water equations, written to realistically address a wide range of physical processes and of atmospheric, ocean and river forcings. The numerical algorithm is low-order, but volume conservative, stable and computationally efficient. It also naturally incorporates wetting and drying of tidal flats. While originally developed to meet specific modeling challenges for the Columbia River, ELCIRC has been extensively tested against standard ocean/coastal benchmarks, and is starting to be applied to estuaries and continental shelves around the world.


A functional reactive language for interactive applications. Elm is great for 2D and 3D games, diagrams, widgets, and websites.


Embree is a collection of high-performance ray tracing kernels, developed at Intel. The target users of Embree are graphics application engineers who want to improve the performance of their applications by leveraging the optimized ray tracing kernels of Embree. The kernels are optimized for photo-realistic rendering on the latest Intel® processors with support for SSE, AVX, AVX2, and the 16-wide Intel® Xeon Phi™ coprocessor vector instructions. Embree supports runtime code selection to choose the traversal and build algorithms that best match the instruction set of your CPU. We recommend using Embree through its API to get the highest benefit from future improvements. Embree is released as Open Source under the Apache 2.0 license.


EMPIRE is the name given to a way of changing the source code of a dynamical model so that it can interface with sequential data assimilation methods.

EMPIRE should be one of the quickest and easiest ways in which to modify the source code of the model to use data assimilation.


Emscripten is an LLVM-based project that compiles C and C++ into highly-optimizable JavaScript in asm.js format. This lets you run C and C++ on the web at near-native speed, without plugins.


Equelle is a domain-specific language for the specification of simulators for systems of PDEs through a high-level syntax. The language allows the user to focus on equations and numerics while hiding the low-level details of software and hardware implementations.


The Federation of Earth Science Information Partners (ESIP) is a broad-based, distributed community of data and information technology practitioners.


Our proposed work consists of three thrust areas that address these contemporary challenges. First, we will provide high performance I/O middleware that makes effective use of computational platforms, researching a number of optimization strategies and deploying them through the HDF5 software. Second, we will improve the productivity of application developers by hiding the complexity of parallel I/O via new auto-tuning and transparent data re-organization techniques, and by extending our existing work in easy-to-use, high-level APIs that expose scientific data models. Third, we will facilitate scientific analysis for users by extending query-based techniques, developing novel in situ analysis capabilities, and making sure that visualization tools use best practices when reading HDF5 data.


The central goal of ExaStencils is to develop a radically new software technology for applications with exascale performance. To reach this goal, the project focusses on a comparatively narrow but very important application domain. The aim is to enable a simple and convenient formulation of problem solutions in this domain. The software technology developed in ExaStencils shall facilitate the highly automatic generation of a large variety of efficient implementations via the judicious use of domain-specific knowledge in each of a sequence of optimization steps such that, at the end, exascale performance results.

The application domain chosen is that of stencil codes, i.e., compute-intensive algorithms in which data points in a grid are redefined repeatedly as a combination of the values of neighboring points. The neighborhood pattern used is called a stencil. Stencil codes are used for the solution of discrete partial differential equations and the resulting linear systems.
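To make the idea of a stencil concrete, here is a minimal sketch in plain Python. It is not ExaStencils code: the function names and the choice of a one-dimensional three-point averaging stencil (a Jacobi iteration for the discrete Laplace equation) are assumptions made purely for illustration.

```python
def jacobi_step(u):
    """Apply one sweep of a 3-point averaging stencil to the interior points."""
    new = list(u)  # the two boundary values are copied unchanged
    for i in range(1, len(u) - 1):
        # Each point is redefined from the values of its two neighbours.
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def solve(u, sweeps):
    """Repeat the stencil sweep a fixed number of times."""
    for _ in range(sweeps):
        u = jacobi_step(u)
    return u


# Boundary values 0 (left) and 1 (right); interior starts at 0.
grid = [0.0] * 8 + [1.0]
result = solve(grid, 2000)
print([round(v, 3) for v in result])
```

With these boundary values the iteration settles on a straight line between 0 and 1, which is the solution of the one-dimensional discrete Laplace equation.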


Excafé, an EXpression Capturing Finite Element Library, was developed during my PhD as a means to explore the benefits of using active library techniques for the performance optimisation of finite-element simulations. In particular, active library techniques facilitate efficient implementations of domain specific languages.

Excafé only supports triangular meshes with Lagrange basis functions at present. Furthermore boundary integrals have not yet been implemented. However, the functionality present is more than sufficient to implement an incompressible Navier-Stokes solver, which is included in the distribution.

One topic that Excafé has been used to explore is the symbolic analysis of the expressions in finite element local assembly matrices. Excafé has access to run-time representations of variational forms and basis functions. It uses this to build symbolic representations of each entry of the local assembly matrix. Once it has these, it uses a common sub-expression elimination pass targeted at polynomial evaluation to find an evaluation strategy for these expressions that minimizes operation count.


EZFIO is the Easy Fortran I/O library generator. It automatically generates an I/O library from a simple configuration file. The produced library contains Fortran subroutines to read/write the data from/to disk, and to check if the data exists. A Python and an OCaml API are also provided.

With EZFIO, the data is organized in a file system inside a main directory. This main directory contains subdirectories, which contain files. Each file corresponds to one data item. For atomic data the file is a plain text file, and for array data the file is a gzipped text file.
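The layout described above can be imitated with the Python standard library. This is only a sketch of the directory-per-group, file-per-item idea: the function names, the `.gz` suffix, and the one-value-per-line text format are assumptions, not the files or API that EZFIO actually generates.

```python
import gzip
import os
import tempfile


def write_scalar(root, group, name, value):
    """Store one atomic value as a plain text file at root/group/name."""
    os.makedirs(os.path.join(root, group), exist_ok=True)
    with open(os.path.join(root, group, name), "w") as f:
        f.write(f"{value}\n")


def write_array(root, group, name, values):
    """Store an array as a gzipped text file at root/group/name.gz."""
    os.makedirs(os.path.join(root, group), exist_ok=True)
    with gzip.open(os.path.join(root, group, name + ".gz"), "wt") as f:
        f.write("\n".join(str(v) for v in values) + "\n")


def has_data(root, group, name):
    """Check whether an item exists, in either plain or gzipped form."""
    base = os.path.join(root, group, name)
    return os.path.exists(base) or os.path.exists(base + ".gz")


def read_array(root, group, name):
    """Read back an array written by write_array."""
    with gzip.open(os.path.join(root, group, name + ".gz"), "rt") as f:
        return [float(line) for line in f]


root = tempfile.mkdtemp()  # stands in for the main directory
write_scalar(root, "molecule", "num_atoms", 3)
write_array(root, "molecule", "charges", [0.0, 1.5, 3.0])
print(has_data(root, "molecule", "charges"), read_array(root, "molecule", "charges"))
```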


The EvoGrid is a worldwide, cross-disciplinary effort to create an abstract, yet plausible simulation of the chemical origins of life on Earth. One could think of this as an artificial origin of life experiment. Our strategy is to employ a large number of computers in a grid to simulate a digital primordial soup along with a distributed set of computers acting as observers looking into that grid. These observers, modeled after the very successful @Home scientific computation projects, will be looking for signs of emergent complexity and reporting back to the central grid.


The Factor programming language combines powerful language features with a full-featured library. The implementation is fully compiled for performance, while still supporting interactive development. Factor applications are portable between all common platforms. Factor can deploy stand-alone applications on all platforms.

Factor belongs to the family of concatenative languages: this means that, at the lowest level, a Factor program is a series of words (functions) that manipulate a stack of references to dynamically-typed values. This gives the language a powerful foundation which allows many abstractions and paradigms to be built on top.
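The concatenative model can be imitated in a few lines of Python. This is a toy sketch, not Factor's implementation: the `run` function and the three-word dictionary are hypothetical, and the stack holds only integers rather than references to arbitrary dynamically-typed values.

```python
def dup(stack):
    """Duplicate the top of the stack."""
    stack.append(stack[-1])


def add(stack):
    """Pop two values and push their sum."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def mul(stack):
    """Pop two values and push their product."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


WORDS = {"dup": dup, "+": add, "*": mul}


def run(program, words):
    """Evaluate a whitespace-separated series of words against a fresh stack."""
    stack = []
    for token in program.split():
        if token in words:
            words[token](stack)       # a word manipulates the stack
        else:
            stack.append(int(token))  # a literal pushes itself
    return stack


print(run("3 dup * 4 +", WORDS))  # [13]
```

The program squares 3 and adds 4 without naming any variables; every word simply consumes and produces values on the shared stack.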


The FASTMath SciDAC Institute develops and deploys scalable mathematical algorithms and software tools for reliable simulation of complex physical phenomena and collaborates with application scientists to ensure the usefulness and applicability of FASTMath technologies.


FAUST (Functional Audio Stream) is a functional programming language specifically designed for real-time signal processing and synthesis. FAUST targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards.


This project provides an LV2 plugin architecture for the Faust programming language. The package contains the Faust architecture and templates for the needed LV2 manifest (ttl) files, a collection of sample plugins written in Faust, and a generic GNU Makefile for compiling the plugins.


A virtual guitar amplifier for Linux running with JACK (JACK Audio Connection Kit). It takes the signal from your guitar as any real amp would do: as a mono-signal from your sound card. Your tone is processed by a main amp and a rack-section. Both can be routed separately and deliver a processed stereo-signal via JACK. You may fill the rack with effects from more than 25 built-in modules spanning from a simple noise-gate to brain-slashing modulation-fx like flanger, phaser or auto-wah. Your signal is processed with minimum latency. On any properly set-up Linux system you do not need to wait for more than 10 milliseconds for your playing to be processed and delivered by guitarix. It offers the range of sounds you would expect from a full-featured universal guitar-amp. Many of the guitarix effects are written in Faust.


The FEAST solver package is a free high-performance numerical library for solving the standard or generalized eigenvalue problem, and obtaining all the eigenvalues and eigenvectors within a given search interval. It is based on an innovative fast and stable numerical algorithm — named the FEAST algorithm — which deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms) or other Davidson-Jacobi techniques. The FEAST algorithm takes its inspiration from the density-matrix representation and contour integration technique in quantum mechanics. It is free from explicit orthogonalization procedures, and its main computational tasks consist of solving very few inner independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of magnitude smaller than the original one. The FEAST algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel architectures.

This general purpose FEAST solver package includes both reverse communication interfaces and ready-to-use predefined interfaces for dense, banded and sparse systems. It includes double and single precision arithmetic, and all the interfaces are compatible with Fortran (77, 90) and C. FEAST is both a comprehensive library package and easy-to-use software. This solver is expected to significantly improve numerical performance and capabilities in large-scale modern applications.

Fedora Playground Repository

The Playground repository gives contributors a place to host packages that are not up to the standards of the main Fedora repository but may still be useful to other users. For now the Playground repository contains both packages that are destined for eventual inclusion into the main Fedora repository and packages that are never going to make it there. Users of the repository should be willing to endure a certain amount of instability when using packages from there.


Feel++ is a C++ library for solving partial differential equations using generalized Galerkin methods such as the finite element method, the h/p finite element method, the spectral element method or the reduced basis method.


FiberViewerLight is an open-source C++ application to analyze fiber bundles. FiberViewerLight is now available as a 3D Slicer extension.


A library for fast computation of Gauss transforms in multiple dimensions, using the Improved Fast Gauss Transform and Approximate Nearest Neighbor searching. This software allows for efficient computation of probabilities by Kernel Density Estimation (KDE), and can reduce complexity of algorithms commonly used in Computer Vision, Machine Learning, etc, that must evaluate the Gauss transform.
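For reference, the quantity being accelerated is the discrete Gauss transform, G(y_j) = sum_i q_i exp(-||y_j - x_i||^2 / h^2). The sketch below evaluates it directly, at a cost proportional to the number of sources times the number of targets. It does not implement the Improved Fast Gauss Transform, and the function name and argument order are hypothetical rather than taken from the library.

```python
import math


def gauss_transform(sources, weights, targets, h):
    """Directly evaluate the Gauss transform at each target point.

    sources, targets: sequences of equal-length coordinate tuples
    weights: one weight q_i per source
    h: bandwidth
    """
    results = []
    for y in targets:
        total = 0.0
        for x, q in zip(sources, weights):
            dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
            total += q * math.exp(-dist2 / (h * h))
        results.append(total)
    return results


values = gauss_transform(
    sources=[(0.0, 0.0)],
    weights=[2.0],
    targets=[(0.0, 0.0), (1.0, 0.0)],
    h=1.0,
)
print(values)
```

Every source is visited for every target in this direct form, which is the cost that fast methods are designed to avoid.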


Fiona is designed to be simple and dependable. It focuses on reading and writing data in standard Python IO style, and relies upon familiar Python types and protocols such as files, dictionaries, mappings, and iterators instead of classes specific to OGR. Fiona can read and write real-world data using multi-layered GIS formats and zipped virtual file systems and integrates readily with other Python GIS packages such as pyproj, Rtree, and Shapely.


Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM). Firedrake enables users to apply a wide range of discretisations to an infinite variety of PDEs and to employ either conventional CPUs or GPUs to obtain the solution.

Firedrake employs the Unified Form Language (UFL) and FEniCS Form Compiler (FFC) from the FEniCS Project while the parallel execution of FEM assembly is accomplished by the PyOP2 system. The global mesh data structures, as well as linear and non-linear solvers, are provided by PETSc.


Firm is a C-library that provides a graph-based intermediate representation, optimizations, and assembly code generation suitable for use in compilers.

Flex Projector

Flex Projector is a freeware, cross-platform application for creating custom world map projections. The intuitive interface allows users to easily modify dozens of popular world map projections—the possibilities range from slight adjustments to making completely new projections. Flex Projector is intended as a tool for practicing mapmakers and students of cartography.


FlowPy is a numerical toolbox for the solution of partial differential equations encountered in Functional Renormalization Group equations. This toolbox compiles flow equations to fast machine code and is able to handle coupled systems of flow equations with full momentum dependence, which furthermore may be given implicitly.


This software is capable of reading in 2D or 3D velocity data and computing FTLE fields, computing tracer/particle trajectories, and interpolating the velocity data onto another mesh.


An open source, general purpose, multi-phase computational fluid dynamics code capable of numerically solving the Navier-Stokes equations and accompanying field equations on arbitrary unstructured finite element meshes in one, two and three dimensions. It is used in a number of different scientific areas including geophysical fluid dynamics, computational fluid dynamics, ocean modelling and mantle convection. It uses a finite element/control volume method which allows arbitrary movement of the mesh with time dependent problems, allowing mesh resolution to increase or decrease locally according to the current simulated state. It has a wide range of element choices including mixed formulations. Fluidity is parallelised using MPI and is capable of scaling to many thousands of processors. Other novel features include a user-friendly GUI and a Python interface which can be used to calculate diagnostic fields, set prescribed fields or set user-defined boundary conditions.


ForestClaw is a parallel, multi-block adaptive finite volume library for solving PDEs on mapped, logically Cartesian meshes. For solving hyperbolic problems using explicit, single step algorithms, ForestClaw’s block-structured adaptive algorithm, including multi-rate time stepping, uses the Berger, Oliger and Colella AMR algorithms (JCP, 1984, 1989). The hyperbolic solvers are currently based on ClawPack (R. J. LeVeque). Future plans include support for general method-of-lines solvers in a multi-rate setting.

Where ForestClaw departs from the standard Berger-Oliger-Colella block-structured approach is that the multi-resolution grid hierarchy is not stored as overlapping, nested grids but rather as a composite structure of non-overlapping fixed sized grids, each of which is stored as a leaf in a forest of quad- or octrees.
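The composite structure can be pictured with a small quadtree in which only the leaves carry a grid, and every grid has the same number of cells regardless of refinement level. The sketch below is an illustration under those assumptions; the `Patch` class and its 8x8 grid size are hypothetical and unrelated to ForestClaw's actual data structures.

```python
class Patch:
    """A square region of the domain; a leaf holds one fixed-size grid."""

    def __init__(self, x, y, size, level=0, cells=8):
        self.x, self.y, self.size, self.level = x, y, size, level
        self.cells = cells
        self.grid = [[0.0] * cells for _ in range(cells)]
        self.children = []

    def refine(self):
        """Replace this leaf by four non-overlapping children of half the size."""
        half = self.size / 2
        self.children = [
            Patch(self.x + dx * half, self.y + dy * half, half,
                  self.level + 1, self.cells)
            for dy in (0, 1)
            for dx in (0, 1)
        ]
        self.grid = None  # interior nodes store no data; only leaves do

    def leaves(self):
        """Return all leaves below (or equal to) this node."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


root = Patch(0.0, 0.0, 1.0)   # one tree; a forest would be a list of such roots
root.refine()
root.children[0].refine()     # refine only the lower-left quadrant again
leaves = root.leaves()
print(len(leaves), sorted({leaf.level for leaf in leaves}))
```

Because the leaves tile the domain without overlap, a finer leaf covers a smaller area with the same number of cells, which is where the extra resolution comes from.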


ForestGOMP is an OpenMP runtime compatible with GCC 4.2, offering a structured way to efficiently execute OpenMP applications onto hierarchical (NUMA) architectures.


FortranCL is an OpenCL interface for Fortran 90. It allows programmers to call the OpenCL parallel programming framework directly from Fortran, so developers can accelerate their Fortran code using graphical processing units (GPU) and other accelerators.

The interface is designed to be as close to the C OpenCL interface as possible, while written in native Fortran 90 with type checking. It was originally designed as an OpenCL interface to be used by the Octopus code.

The interface is not complete but provides all the basic calls required to write a full Fortran 90 OpenCL program.


Freenet is free software which lets you anonymously share files, browse and publish "freesites" (web sites accessible only through Freenet) and chat on forums, without fear of censorship. Freenet is decentralised to make it less vulnerable to attack, and if used in "darknet" mode, where users only connect to their friends, is very difficult to detect.

Communications by Freenet nodes are encrypted and are routed through other nodes to make it extremely difficult to determine who is requesting the information and what its content is.

Users contribute to the network by giving bandwidth and a portion of their hard drive (called the "data store") for storing files. Files are automatically kept or deleted depending on how popular they are, with the least popular being discarded to make way for newer or more popular content. Files are encrypted, so generally users cannot easily discover what is in their data store, and hopefully can’t be held accountable for it. Chat forums, websites, and search functionality are all built on top of this distributed data store.


GALAHAD is a thread-safe library of Fortran 2003 packages for solving nonlinear optimization problems. At present, the areas covered by the library are unconstrained and bound-constrained optimization, quadratic programming, nonlinear programming, systems of nonlinear equations and inequalities, and nonlinear least squares problems.


Galry is a high performance interactive visualization package in Python based on OpenGL. It allows interactive visualization of very large plots (tens of millions of points) in real time, by using the graphics card as much as possible.

Galry’s high-level interface is directly inspired by Matplotlib and Matlab. The low-level interface can be used to write complex interactive visualization GUIs with Qt that deal with large 2D/3D datasets.

Visualization capabilities of Galry are not restricted to plotting, and include textures, 3D meshes, graphs, shapes, etc. Custom shaders can also be written for advanced uses.


Robot simulation is an essential tool in every roboticist’s toolbox. A well-designed simulator makes it possible to rapidly test algorithms, design robots, and perform regression testing using realistic scenarios. Gazebo offers the ability to accurately and efficiently simulate populations of robots in complex indoor and outdoor environments. At your fingertips is a robust physics engine, high-quality graphics, and convenient programmatic and graphical interfaces.


The GeM software is designed to automate the generation of determining equations and related operations, in order to compute symmetries and conservation laws for any ODE/PDE system, generally without limitations in DE order and number of variables.

ODE/PDE systems containing arbitrary functions and/or constants can be analyzed, and classes of functions for which additional symmetries / conservation laws occur can be isolated.

GeM output (determining equations) is usually fed into Maple "rifsimp" (a stable routine for differential reduction), which simplifies determining equations, and performs case splits when the given system contains arbitrary functions and/or constants.

GeM also contains special routines to output computed symmetries as well as fluxes/densities of computed conservation laws.


JavaScript Geo visualization and Analysis Library.


GeoNode is a web-based application and platform for developing geospatial information systems (GIS) and for deploying spatial data infrastructures (SDI). Data management tools built into GeoNode allow for integrated creation of data, metadata, and map visualizations. Each dataset in the system can be shared publicly or restricted to allow access to only specific users. Social features like user profiles and commenting and rating systems allow for the development of communities around each platform to facilitate the use, management, and quality control of the data the GeoNode instance contains.


A control system framework for personal fabrication.


Gestalt is a framework for building controllers for automated tools. It enables you to import your machines as Python modules, and makes it easy to connect machines to browser-based user interfaces.


Da bomb.


A git repository browser that can generate static HTML instead of having to run dynamically.

It is smaller, with fewer features and a different set of tradeoffs than other similar software, so if you’re looking for a robust and featureful git browser, please look at gitweb or cgit instead.

However, if you want to generate static HTML at the expense of features, then it can be useful.


The hub subcommand for git allows you to perform many of the operations made available by GitHub’s v3 REST API from the git command line.

You can fork, create, delete and modify repositories. You can get information about users, repositories and issues. You can star, watch and follow things, and find out who else is doing the same. The API is quite extensive. With this command you can do many of your day to day GitHub actions without needing a web browser.

You can also chain commands together using the output of one as the input of another. For example you could use this technique to clone all the repos of a GitHub user or organization, with one command.


GitLab is an advanced Git-repository manager. It introduces a powerful code review and issue-tracking system, complete with GitLab CI: a powerful continuous integration tool.


Gitless is an experimental version control system built on top of Git. Many people complain that Git is hard to use. We think the problem lies deeper than the user interface, in the concepts underlying Git. Gitless is an experiment to see what happens if you put a simple veneer on an app that changes the underlying concepts. Because Gitless is implemented on top of Git (could be considered what Git pros call a porcelain of Git), you can always fall back on Git. And of course your coworkers you share a repo with need never know that you’re not a Git aficionado.


This git command "clones" an external git repo into a subdirectory of your repo. Later on, upstream changes can be pulled in, and local changes can be pushed back. Simple.


Gizeh is a Python library for vector graphics. Gizeh is written on top of the module cairocffi, which is a Python binding of the popular C library Cairo. Cairo is powerful, but difficult to learn and use. Gizeh implements a few classes on top of Cairo that make it more intuitive.


This site summarises the 1D lake water balance and vertical stratification model: “The General Lake Model” (GLM).


Lasso and elastic-net regularized generalized linear models. This is a Matlab port of the extremely efficient procedures for fitting the entire lasso or elastic-net path for linear regression, logistic and multinomial regression, Poisson regression, and the Cox model.


This page contains code for learning Granger causality in different settings. The code is written in Matlab and depends on the GLMnet package for performing Lasso. Lasso-Granger is an efficient algorithm for learning the temporal dependency among multiple time series based on variable selection using Lasso. Copula-Granger extends the power of Lasso-Granger to non-linear datasets. It uses the copula technique to separate the marginal properties of the joint distribution from its dependency structure.


We describe glsim, a C++ library designed to provide routines to perform basic housekeeping tasks common to a very wide range of simulation programs, such as reading simulation parameters or reading and writing self-describing binary files with simulation data. The design also provides a framework to add features to the library while preserving its structure and interfaces.


Glumpy is a Python library for scientific visualization that is fast, scalable, and beautiful. Glumpy offers an intuitive interface between NumPy and modern OpenGL.


Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

Google Earth Engine

Google Earth Engine brings together the world’s satellite imagery — trillions of scientific measurements dating back almost 40 years — and makes it available online with tools for scientists, independent researchers, and nations to mine this massive warehouse of data to detect changes, map trends and quantify differences on the Earth’s surface. Applications include: detecting deforestation, classifying land cover, estimating forest biomass and carbon, and mapping the world’s roadless areas.


GPI-2 implements the GASPI specification, an API specification which originates from the ideas and concepts of GPI. GPI-2 is an API for asynchronous communication. It provides a flexible, scalable and fault tolerant interface for parallel applications.


GPTIPS is a free symbolic data mining platform and interactive modelling environment for MATLAB.


Gradle is an open source build automation system. Gradle can automate the building, testing, publishing, deployment and more of software packages or other types of projects such as generated static websites, generated documentation or indeed anything else.

Gradle is a project automation tool that builds upon the concepts of Apache Ant and Apache Maven and introduces a Groovy-based domain-specific language (DSL) instead of the more traditional XML form of declaring the project configuration.

Gradle was designed for multi-project builds which can grow to be quite large, and supports incremental builds by intelligently determining which parts of the build tree are up-to-date, so that any task dependent upon those parts will not need to be re-executed.
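The up-to-date idea can be sketched without Gradle itself. In the toy model below a task re-executes only when the fingerprint of its inputs has changed or when one of the tasks it depends on re-executed. The `Task` class, the `build` function, and the hashing scheme are hypothetical illustrations; Gradle's real up-to-date checks are considerably richer.

```python
import hashlib


class Task:
    def __init__(self, name, inputs, depends_on=()):
        self.name = name
        self.inputs = inputs              # dict: input name -> content string
        self.depends_on = list(depends_on)
        self.last_fingerprint = None      # fingerprint seen at last execution

    def fingerprint(self):
        """Hash all inputs in a stable order."""
        h = hashlib.sha256()
        for key in sorted(self.inputs):
            h.update(key.encode() + b"\0" + self.inputs[key].encode() + b"\0")
        return h.hexdigest()


def build(task, ran):
    """Build `task` after its dependencies; `ran` records what executed this run."""
    if task.name in ran:
        return ran[task.name]
    # A list (not a generator) so that every dependency is visited.
    upstream_ran = any([build(dep, ran) for dep in task.depends_on])
    current = task.fingerprint()
    up_to_date = (not upstream_ran) and current == task.last_fingerprint
    if not up_to_date:
        task.last_fingerprint = current   # stands in for doing the real work
    ran[task.name] = not up_to_date
    return ran[task.name]


compile_task = Task("compile", {"Main.java": "class Main {}"})
test_task = Task("test", {"MainTest.java": "class MainTest {}"}, [compile_task])

first, second, third, fourth = {}, {}, {}, {}
build(test_task, first)                                   # nothing built yet
build(test_task, second)                                  # nothing changed
test_task.inputs["MainTest.java"] = "class MainTest { }"  # edit the test only
build(test_task, third)
compile_task.inputs["Main.java"] = "class Main { }"       # edit the source
build(test_task, fourth)
print(first, second, third, fourth)
```

Editing only the test input re-runs only the test task, while editing the source re-runs both, because the test task depends on the compile task.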

The initial plugins are primarily focused around Java, Groovy and Scala development and deployment, but more languages and project workflows are on the roadmap.


Graphite is an open-source, distributed parallel simulator for multicore architectures. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development.


The Monash simple climate model is based on the Globally Resolved Energy Balance (GREB) model, which is a climate model published by Dommenget and Floeter [2011] in the international peer-reviewed journal Climate Dynamics. The model simulates most of the main physical processes in the climate system in a very simplistic way and therefore allows very fast and simple climate model simulations. It can compute global climate simulations of one year in about 1 second on a normal PC. Despite its simplicity, the model simulates the climate response to external forcings, such as a doubling of the CO2 concentration, very realistically (similar to state-of-the-art climate models).


In gRPC a client application can directly call methods on a server application on a different machine as if it was a local object, making it easier for you to create distributed applications and services. As in many RPC systems, gRPC is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. On the server side, the server implements this interface and runs a gRPC server to handle client calls. On the client side, the client has a stub that provides exactly the same methods as the server.

gRPC clients and servers can run and talk to each other in a variety of environments - from servers inside Google to your own desktop - and can be written in any of gRPC’s supported languages. So, for example, you can easily create a gRPC server in Java with clients in Go, Python, or Ruby. In addition, the latest Google APIs will have gRPC versions of their interfaces, letting you easily build Google functionality into your applications.


The community GSI system is a variational data assimilation system, designed to be flexible, state-of-the-art, and run efficiently on various parallel computing platforms. The GSI system is in the public domain and is freely available for community use.

The Developmental Testbed Center (DTC) currently maintains and supports a community version of the GSI system (now at Version 3.3). The testing and support of this GSI system at the DTC currently focus on regional numerical weather prediction (NWP) applications coupled with the Weather Research and Forecasting (WRF) Model, but the GSI can be applied to the Global Forecast System (GFS) as well as other modelling systems.


A Fortran90 input/output library, "gtool5", is developed for use with numerical simulation models in the fields of Earth and planetary sciences. The use of this library will simplify implementation of input/output operations into program code in a consolidated form independent of the size and complexity of the software and data. The library also enables simple specification of the metadata needed for post-processing and visualization of the data. These aspects improve the readability of simulation code, which facilitates the simultaneous performance of multiple numerical experiments with different software and efficiency in examining and comparing the numerical results.


GUESS is an exploratory data analysis and visualization tool for graphs and networks. The system contains a domain-specific embedded language called Gython (an extension of Python, or more specifically Jython) which supports the operators and syntactic sugar necessary for working on graph structures in an intuitive manner. An interactive interpreter binds the text that you type in the interpreter to the objects being visualized for more useful integration. GUESS also offers a visualization front end that supports the export of static images and dynamic movies.


Gun is a persisted distributed cache, part of a NoDB movement. It requires zero maintenance and runs on your own infrastructure. Think of it as "Dropbox for Databases" or a "Self-hosted Firebase". This is an early preview, so check out the GitHub repository and read on.

Everything gets cached, so your users experience lightning fast response times. Since gun can be embedded anywhere JavaScript can run, that cache can optionally be right inside your user’s browser using localStorage fallbacks. Updates are then pushed up to the servers when the network is available.


The H5FDdsm project provides a Virtual File Driver for HDF5, which can be used to link two applications via a virtual file system. One application (server/host) owns a memory buffer, which may be distributed over N processes (DSM buffer). The second application (client) writes to HDF5 in parallel using M processes and the data is diverted to the DSM host, where it can be read in parallel as if from disk. The file system is bypassed completely and the data is transmitted using one of several network protocols (MPI or TCP over sockets currently supported). Note that the interface can also be used within the same application as a parallel data staging layer; in this case, no connection is required and information is exchanged between processes using MPI.


H5Part is a very simple data storage schema and provides an API that simplifies the reading/writing of the data to the HDF5 file format. An important foundation for a stable visualization and data analysis environment is a stable and portable file storage format and its associated APIs. The presence of a "common file storage format," including associated APIs, will help foster a fundamental level of interoperability across the project’s software infrastructure. It will also help ensure that key data analysis capabilities are present during the earliest phases of the software development effort.


ICARUS is a ParaView plug-in interfaced around the H5FDdsm driver for steering and visualizing in-situ HDF5 output of simulation codes.


The Habanero-C (HC) language under development in the Habanero project at Rice University builds on past work on Habanero-Java, which in turn was derived from X10 v1.5. HC serves as a research testbed for new compiler and runtime software technologies for extreme scale systems for homogeneous and heterogeneous processors.

Habanero-C is designed to be mapped onto hardware platforms with lightweight system software stacks, such as the Customizable Heterogeneous Platform (CHP) being developed in the NSF Expeditions Center for Domain-Specific Computing (CDSC) which includes CPUs, GPUs, and FPGAs. The C foundation also makes it easier to integrate HC with communication middleware for cluster systems, such as MPI and GASNet.

The Habanero-C compiler is written in C++ and is built on top of the ROSE compiler infrastructure, which was also used in the DARPA-funded PACE project at Rice University. The bulk of the Habanero-C runtime has been written from scratch in portable ANSI C. However, a few library routines for low-level synchronization and atomic operations are written in assembly language for the target platform. To date, the Habanero-C runtime has been ported and tested on Intel X86, Cyclops 64, Power7, Sun Niagara 2 and Intel SCC multicore platforms.


HClib is a library implementation of the Habanero-C language. The reference HClib implementation is built on top of the Open Community Runtime (OCR).


The CnC-Python system under development in the Habanero project at Rice University builds on past work on the Intel Concurrent Collections (CnC) and Habanero CnC projects.


How can we find useful patterns and anomalies in large scale real-world data with multiple attributes? Tensors are suitable for modeling these multidimensional data, and widely used for the analysis of social networks, web data, network traffic, and in many other settings. HaTen2 is a scalable distributed algorithm of tensor decomposition for large scale tensors running on the MapReduce platform. HaTen2 decomposes 100X larger tensors compared to existing methods.


The Haxe programming language is a high-level, strictly typed programming language which is used by the Haxe compiler to produce cross-platform native code. It is easy to learn if you are already familiar with Java, C++, PHP, AS3 or similar object-oriented languages. It has been especially designed to adapt to the native behaviors of the various platforms and to allow efficient cross-platform development.

The Haxe Compiler is responsible for translating the Haxe programming language to the target platform native source code or binary. Each platform is natively supported, without any overhead coming from running inside a virtual machine. The Haxe Compiler is very efficient and can compile thousands of classes in seconds.

The Haxe standard library provides a common set of highly tested APIs that gives you complete cross-platform behavior. This includes data structures, maths and date, serialization, reflection, bytes, crypto, file system, database access, etc. The Haxe standard library also includes platform-specific API that gives you access to important parts of the platform capabilities, and can be easily extended.

The compiler targets include Flash, Neko, JavaScript, ActionScript 3, PHP, C++, Java, C# and Python.

Haxe is written in OCaml.

Haxe UI

Quickly create rich, cross-platform user interfaces with a single framework.


Massive provides a number of open source libraries and tools that are intended to increase the quality, efficiency and consistency of cross-platform development with Haxe.


A haxelib that downloads the node-webkit binary for your platform and makes it accessible via haxelib run node-webkit path/to/index.html. Node Webkit lets you run a Webkit shell on the desktop, meaning you can use Haxe and HTML5 / JS technologies to build your app. It provides full access to the NodeJS APIs so your app can integrate with the system.


Use WxWidgets to create desktop apps with a truly native look and feel on all major platforms. Works with the C++ and Neko targets, and integrates with NME.


An important goal of scientific data analysis is to understand the behavior of a system or process based on a sample of the system. In many instances it is possible to observe both input parameters and system outputs, and characterize the system as a high-dimensional function. Such data sets arise, for instance, in large numerical simulations, as energy landscapes in optimization problems, or in the analysis of image data relating to biological or medical parameters. This paper proposes an approach to analyzing and visualizing such data sets. The proposed method combines topological and geometric techniques to provide interactive visualizations of discretely sampled high-dimensional scalar fields. The method relies on a segmentation of the parameter space using an approximate Morse-Smale complex on the cloud of point samples. For each crystal of the Morse-Smale complex, a regression of the system parameters with respect to the output yields a curve in the parameter space. The result is a simplified geometric representation of the Morse-Smale complex in the high dimensional input domain. Finally, the geometric representation is embedded in 2D, using dimension reduction, to provide a visualization platform. The geometric properties of the regression curves enable the visualization of additional information about each crystal such as local and global shape, width, length, and sampling densities. The method is illustrated on several synthetic examples of two dimensional functions. Two use cases, using data sets from the UCI machine learning repository, demonstrate the utility of the proposed approach on real data. Finally, in collaboration with domain experts, the proposed method is applied to two scientific challenges: the analysis of parameters of climate simulations and their relationship to predicted global energy flux, and the concentrations of chemical species in a combustion simulation and their integration with temperature.


Adaptive, or self-aware, computing has been proposed as one method to help application programmers confront the growing complexity of multicore software development.

However, existing approaches to adaptive systems are largely ad hoc and often do not manage to incorporate the true performance goals of the applications they are designed to support.

This project proposed an enabling technology for adaptive computing systems: Application Heartbeats. The Application Heartbeats framework provides a simple, standard programming interface that applications can use to indicate their performance and system software (and hardware) can use to query an application’s performance.
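
A minimal sketch of what such an interface might look like follows. It is not the Application Heartbeats API itself; the class `Heartbeat` and its `beat` and `rate` methods are hypothetical names chosen for illustration. The application signals progress, and an observer reads the resulting rate.

```python
import time
from collections import deque


class Heartbeat:
    """Records timestamps of application progress and reports a rate."""

    def __init__(self, window=20, clock=time.monotonic):
        self._clock = clock
        self._beats = deque(maxlen=window)   # most recent beat timestamps

    def beat(self):
        """Called by the application each time it completes a unit of work."""
        self._beats.append(self._clock())

    def rate(self):
        """Beats per second over the window; 0.0 until two beats exist."""
        if len(self._beats) < 2:
            return 0.0
        span = self._beats[-1] - self._beats[0]
        return (len(self._beats) - 1) / span if span > 0 else 0.0
```

An application would call `beat()` once per frame, request or iteration, while a scheduler or runtime polls `rate()` and compares it with the target the application cares about.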


CUDA C/C++ and the NVIDIA NVCC compiler toolchain support a number of features designed to make it easier to write portable code, including language integration of host and device code and data, declaration specifiers (e.g. __host__ and __device__) and preprocessor definitions (__CUDACC__). These features combine to enable developers to write code that can be compiled and run on either the host, the device, or both. Other compilers don’t recognize these features, however, so to really write portable code, we need preprocessor macros. This is where Hemi comes in.


HHVM is an open-source virtual machine designed for executing programs written in Hack and PHP. HHVM uses a just-in-time (JIT) compilation approach to achieve superior performance while maintaining the development flexibility that PHP provides.


A massively-parallel high-performance x-ray scattering data analysis code. HipGISAXS is a massively parallel software, which we have developed using C++, augmented with MPI, Nvidia CUDA, OpenMP, and parallel-HDF5 libraries, on large-scale clusters of multi/many-cores and graphics processors. HipGISAXS currently supports *NIX based systems, and is able to harness computational power from any general-purpose CPUs including state-of-the-art multicores, as well as Nvidia GPUs and Intel MIC coprocessors. It is able to handle large input data including any custom complex morphology as described in the following, and perform GISAXS simulations at high resolutions.


HLib is a program library for hierarchical matrices and H2-matrices. H-matrices are a powerful tool for representing and working with dense (and sparse) matrices, e.g. from integral or partial differential equations. They allow the complete matrix algebra, e.g. matrix-vector multiplication, matrix addition, multiplication, inversion and factorisation in almost linear time with respect to the number of rows and columns.

HLIBpro contains various algorithms for the approximation of dense matrices, e.g. ACA and HCA, the complete set of available H-algebra, various clustering techniques, e.g. geometric and algebraic clustering, many functions for discretising integral equations, e.g. Laplace, Helmholtz and Maxwell equations. A special focus of HLIBpro lies in the parallelisation of these methods to shared (threads) and distributed memory machines (MPI).


Higher Order (Symplectic) Methods in Python. Explicit algorithms for higher order symplectic integration of a large class of Hamilton’s equations have recently been discussed by Mushtaq et al. Here we present a Python program for automatic numerical implementation of these algorithms for a given Hamiltonian, both for double precision and multiprecision computations. We provide examples of how to use this program, and illustrate behavior of both the code generator and the generated solver module(s).
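
For readers unfamiliar with symplectic integration, the sketch below shows the simplest such scheme, the second-order Stormer-Verlet (leapfrog) method, applied to a harmonic oscillator. It is not one of the higher order algorithms the program generates, and the function names are illustrative only.

```python
def leapfrog(q, p, grad_v, dt, steps):
    """Stormer-Verlet integration of H(q, p) = p**2 / 2 + V(q)."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_v(q)   # half kick
        q += dt * p                 # full drift
        p -= 0.5 * dt * grad_v(q)   # half kick
    return q, p


def energy(q, p):
    """Energy of the unit harmonic oscillator, V(q) = q**2 / 2."""
    return 0.5 * p * p + 0.5 * q * q


# Start at q = 1, p = 0 (energy 0.5) and integrate for many periods.
q_end, p_end = leapfrog(1.0, 0.0, lambda q: q, dt=0.01, steps=10_000)
drift = abs(energy(q_end, p_end) - 0.5)
```

The point of a symplectic scheme is that `drift` stays bounded, of order `dt**2` here, rather than growing steadily with the number of steps as it does for a non-symplectic method such as explicit Euler.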


HOP is a multi-tier programming language for the Web 2.0 and the so-called diffuse Web. It is designed for programming interactive web applications in many fields such as multimedia (web galleries, music players, …​), ubiquitous and house automation (SmartPhones, personal appliance), mashups, office (web agendas, mail clients, …​), etc.

HOP features include:

  • an extensive set of widgets for programming fancy and portable Web GUIs,

  • full compatibility with traditional Web technologies (JavaScript, HTML, CSS),

  • HTML5 support,

  • a versatile Web server supporting HTTP/1.0 and HTTP/1.1,

  • native multimedia support for enabling ubiquitous Web multimedia applications,

  • fast WebDAV level 1 support,

  • an optimizing native code compiler for server code,

  • an on-the-fly JavaScript compiler for client code,

  • an extensive set of libraries for the mail, calendars, databases, Telephony


A Python Just-In-Time compiler for astrophysical computations. In order to combine the ease of Python and the speed of C, we developed HOPE, a specialised Python just-in-time (JIT) compiler designed for numerical astrophysical applications. HOPE focuses on a subset of the language and is able to translate Python code into C while performing numerical optimisation on mathematical expressions at runtime. To enable the JIT compilation, the user only needs to add a decorator to the function definition. We assess the performance of HOPE by performing a series of benchmarks and compare its execution speed with that of plain Python, C and the other existing frameworks. We find that HOPE improves the performance compared to plain Python by a factor of 2 to 120, achieves speeds comparable to that of C, and often exceeds the speed of the existing solutions.


HPC-GAP is the EPSRC funded project to reengineer the software for computation in algebra and discrete mathematics to take advantage of the power of current and future high-performance computers. Our main focus is on the GAP system and the more recent SymGridPar middleware, which provide flexible and effective computation on single processors and small clusters. We will adapt the software to efficiently use large clusters of multi-core processors to perform larger computations. To demonstrate the effectiveness of our adaptations we will apply our new software to problems from a number of important areas of pure mathematics.


HPGMG implements full multigrid (FMG) algorithms using finite-volume and finite-element methods. Different algorithmic variants adjust the arithmetic intensity and architectural properties that are tested. These FMG methods converge up to discretization error in one F-cycle, thus may be considered direct solvers. An F-cycle visits the finest level a total of two times, the first coarsening (8x smaller) 4 times, the second coarsening 6 times, etc.

HPGMG-FV solves constant- and variable-coefficient elliptic problems on isotropic Cartesian grids using Full Multigrid (FMG). The method is second-order accurate in the max norm, as demonstrated by the FMG convergence. FMG interpolation (prolongation) is linear and V-cycle interpolation and restriction are piecewise constant. Recursive decomposition is used to construct a space filling curve akin to Z-Mort in order to distribute work among processes. Chebyshev polynomials are used for smoothing, preconditioned by the diagonal. FMG convergence is observed with a fourth order Chebyshev polynomial using a V(4,4) cycle. Thus convergence is reached in a total of 9 fine-grid operator applications (4 presmooths, residual, 4 postsmooths). This makes HPGMG-FV extremely fast and energy efficient.
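
To illustrate how a Z-order (Morton) space filling curve can distribute work, the sketch below interleaves the bits of three box coordinates into a single index and cuts the sorted curve into contiguous chunks. It is a generic illustration, not HPGMG's decomposition code; `morton3` and `partition` are hypothetical names.

```python
def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Z-order index."""
    index = 0
    for b in range(bits):
        index |= ((x >> b) & 1) << (3 * b)
        index |= ((y >> b) & 1) << (3 * b + 1)
        index |= ((z >> b) & 1) << (3 * b + 2)
    return index


def partition(boxes, nprocs):
    """Sort boxes along the curve and cut it into contiguous chunks."""
    ordered = sorted(boxes, key=lambda box: morton3(*box))
    chunk = max(1, -(-len(ordered) // nprocs))   # ceiling division
    return [ordered[i:i + chunk] for i in range(0, len(ordered), chunk)]


# A 2 x 2 x 2 grid of boxes shared between two processes.
grid = [(x, y, z) for z in range(2) for y in range(2) for x in range(2)]
parts = partition(grid, 2)
```

Because nearby indices on the curve tend to be nearby in space, each process receives a spatially compact set of boxes, which keeps neighbour communication low.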


HSL (formerly the Harwell Subroutine Library) is a collection of state-of-the-art packages for large-scale scientific computation written and developed by the Numerical Analysis Group at the STFC Rutherford Appleton Laboratory and other experts. HSL offers users a high standard of reliability and has an international reputation as a source of robust and efficient numerical software. Among its best known packages are those for the solution of sparse linear systems of equations and sparse eigenvalue problems. MATLAB interfaces are offered for selected packages.

The Library was started in 1963 and was originally used at the Harwell Laboratory on IBM mainframes running under OS and MVS. Over the years, the Library has evolved and has been extensively used on a wide range of computers, from supercomputers to modern PCs. Recent additions include optimised support for multicore processors.

HSL packages are available at no cost for academic research and teaching. See download links for individual packages in the catalogue.


Bring the best of JavaScript data visualization to R. Use JavaScript visualization libraries at the R console, just like plots. Embed widgets in R Markdown documents and Shiny web applications. Develop new widgets using a framework that seamlessly bridges R and JavaScript.


Hugo is a general-purpose website framework. Technically speaking, Hugo is a static site generator. This means that, unlike systems like WordPress, Ghost and Drupal, which run on your web server expensively building a page every time a visitor requests one, Hugo does the building when you create your content. Since websites are viewed far more often than they are edited, Hugo is optimized for website viewing while providing a great writing experience.

Sites built with Hugo are extremely fast and very secure. Hugo sites can be hosted anywhere, including Heroku, GoDaddy, DreamHost, GitHub Pages, Amazon S3 and CloudFront, and work well with CDNs. Hugo sites run without dependencies on expensive runtimes like Ruby, Python or PHP and without dependencies on any databases.


The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, …​) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.


HSQLDB (HyperSQL DataBase) is the leading SQL relational database software written in Java. It offers a small, fast multithreaded and transactional database engine with in-memory and disk-based tables and supports embedded and server modes. It includes a powerful command line SQL tool and simple GUI query tools.


IBEX is a C++ library for constraint processing over real numbers. It provides reliable algorithms for handling non-linear constraints. In particular, roundoff errors are also taken into account. It is based on interval arithmetic and affine arithmetic. The main feature of Ibex is its ability to build strategies declaratively through the contractor programming paradigm. It can also be used as a black-box solver.

It can be used to solve a variety of problems that can roughly be formulated as finding a reliable characterization with boxes (Cartesian products of intervals) of sets implicitly defined by constraints. Reliable means that all sources of uncertainty should be taken into account, including:

  • approximation of real numbers by floating-point numbers

  • round-off errors

  • linearization truncation errors

  • model parameter uncertainty

  • measurement noise
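
The core idea of reliable computation with intervals can be sketched briefly. The example below is not the IBEX API; it is a hypothetical `Interval` class that widens every result by one floating-point step in each direction (using `math.nextafter`, available from Python 3.9) so that round-off can never push the true value outside the enclosure.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return _outward(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return _outward(min(products), max(products))

    def __contains__(self, x):
        return self.lo <= x <= self.hi


def _outward(lo, hi):
    """Widen by one floating-point step each way to absorb round-off."""
    return Interval(math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))


# The exact sum of the two floats nearest 0.1 and 0.2, computed with
# rationals, must lie inside the interval result.
exact = Fraction(0.1) + Fraction(0.2)
enclosure = Interval(0.1, 0.1) + Interval(0.2, 0.2)
```

Widening by a full step is cruder than the directed rounding a production library would use, but it shows the guarantee: the computed box always contains the exact answer.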


ICALAB for Signal Processing and ICALAB for Image Processing are two independent demo packages for MATLAB that implement a number of efficient algorithms for ICA (independent component analysis) employing HOS (higher order statistics), BSS (blind source separation) employing SOS (second order statistics) and LP (linear prediction), and BSE (blind signal extraction) employing various SOS and HOS methods.


An open-source, distributed, time series database with no external dependencies.


Instrumentino is an open-source modular graphical user interface framework for controlling Arduino based experimental instruments. It expands the control capability of Arduino by allowing instrument builders to easily create a custom user interface program running on an attached personal computer. It enables the definition of operation sequences and their automated running without user intervention.

Acquired experimental data and a usage log are automatically saved on the computer for further processing.

Complex devices, which are difficult to control using an Arduino, may be integrated as well by incorporating third party application programming interfaces (APIs) into the Instrumentino framework.

Interactive Spaces

Interactive Spaces is a software platform which allows you to merge the virtual world with the physical world. By making it easy to connect sensors to applications running on different machines in a space, quite complex behaviors can be built.

Interactive Spaces applications are built from units called Activities, which can easily communicate with each other no matter where they are on the local network. Through the Interactive Spaces communication system, called a route, any activity in the space can speak to or listen to messages from any other activities it chooses. This means you can easily control and synchronize events across a collection of machines.


Trendy stuff about wee hardware running this sort of software.


ActiveMQ Apollo is a faster, more reliable, easier to maintain messaging broker built from the foundations of the original ActiveMQ. It accomplishes this using a radically different threading and message dispatching architecture. Like ActiveMQ, Apollo is a multi-protocol broker and supports STOMP, AMQP, MQTT, Openwire, SSL, and WebSockets.


Californium (Cf) is an open source implementation of the Constrained Application Protocol (CoAP). It is written in Java and targets unconstrained environments such as back-end service infrastructures (e.g., proxies, resource directories, or cloud services) and less constrained environments such as embedded devices running Linux (e.g., smart home/factory controllers or cellular gateways). Californium (Cf) has served as running code for the IETF standardization of CoAP and was recently reimplemented from scratch, drawing on that experience. In particular, Cf now focuses on service scalability for large-scale Internet of Things applications. The new implementation was successfully tested at the ETSI CoAP and OMA LWM2M Plugtests in November 2013 and March 2014. It complies with all mandatory and optional test cases.


An integration middleware for the Internet of Things. It provides a communication stack for embedded devices based on IPv6, Web services and oBIX to provide interoperable interfaces for smart objects. Using 6LoWPAN for constrained wireless networks and the Constrained Application Protocol together with Efficient XML Interchange, it provides an efficient stack that allows interoperable Web technologies to be used in sensor and actuator networks and systems while remaining nearly as efficient in transmission message size as existing automation systems. The IoTSyS middleware aims to provide a gateway concept for the sensor and actuator systems found in today’s home and building automation systems and a stack that can be deployed directly on embedded 6LoWPAN devices, and it further addresses security, discovery and scalability issues.


Kura is a Java/OSGi-based framework for IoT gateways. Kura APIs offer access to the underlying hardware (serial ports, GPS, watchdog, GPIOs, I2C, etc.), management of network configurations, communication with M2M/IoT Integration Platforms, and gateway management.


The Mihini project delivers an embedded runtime running on top of Linux, that exposes a high-level Lua API for building Machine-to-Machine applications.


MQTT stands for MQ Telemetry Transport. It is a publish/subscribe, extremely simple and lightweight messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks. The design principles are to minimise network bandwidth and device resource requirements whilst also attempting to ensure reliability and some degree of assurance of delivery. These principles also turn out to make the protocol ideal for the emerging “machine-to-machine” (M2M) or “Internet of Things” world of connected devices, and for mobile applications where bandwidth and battery power are at a premium.
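
The publish/subscribe model can be sketched with an in-process toy broker. The matching function follows MQTT's topic filter wildcards (`+` for exactly one level, `#` for all remaining levels), but it leaves out the special handling of topics that begin with `$`, and nothing here speaks the wire protocol. `topic_matches` and `Broker` are hypothetical names.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic name matches a subscription filter."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":                        # multi-level wildcard, must be last
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False
        if f != "+" and f != t_levels[i]:   # '+' matches exactly one level
            return False
    return len(f_levels) == len(t_levels)


class Broker:
    """Toy broker: delivers each published message to matching subscribers."""

    def __init__(self):
        self._subs = []                     # (filter, callback) pairs

    def subscribe(self, topic_filter, callback):
        self._subs.append((topic_filter, callback))

    def publish(self, topic, payload):
        for topic_filter, callback in self._subs:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)
```

Publishers and subscribers never refer to each other, only to topic names, which is what lets a constrained sensor publish a reading without knowing who consumes it.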


The Mosquitto project provides an open-source implementation of an MQTT broker. It implements the MQ Telemetry Transport protocol versions 3.1 and 3.1.1. MQTT provides a lightweight method of carrying out messaging using a publish/subscribe model. This makes it suitable for "machine to machine" messaging such as with low power sensors or mobile devices such as phones, embedded computers or microcontrollers like the Arduino.

The Mosquitto broker is the focus of the project and aims to be a lightweight and functional MQTT broker that can run on relatively constrained systems, but still be powerful enough for a wide range of applications. The mosquitto_pub and mosquitto_sub command line utilities provide a straightforward and powerful way of interacting with your broker. The client library that the utilities use for their MQTT support can be used to develop your own MQTT applications.


A web interface for MQTT. A simple web interface that can subscribe to an MQTT topic and display the information.


An open source utility intended to help you with monitoring activity on MQTT topics. It’s been designed to deal with high volumes of messages, as well as occasional publications. A JavaFX application that should work on any operating system with an appropriate version of Java 8 installed. mqtt-spy-daemon is a Java-based command line tool that does not require a GUI environment. Basic functionality works with Java 7, whereas some of the advanced features like scripting require Java 8 to be installed.


The Paho project provides open-source client implementations of open and standard messaging protocols aimed at new, existing, and emerging applications for Machine‑to‑Machine (M2M) and Internet of Things (IoT).


The Internet of Things (IoT) is set to occupy a substantial component of the future Internet. The IoT connects sensors and devices that record physical observations to applications and services of the Internet. As a successor to technologies such as RFID and Wireless Sensor Networks (WSN), the IoT has stumbled into vertical silos of proprietary systems, providing little or no interoperability with similar systems. As the IoT represents a future state of the Internet, an intelligent and scalable architecture is required to provide connectivity between these silos, enabling discovery of physical sensors and interpretation of messages between things. This paper proposes a gateway and Semantic Web enabled IoT architecture to provide interoperability between systems using established communication and data standards. The Semantic Gateway as Service (SGS) allows translation between messaging protocols such as XMPP, CoAP and MQTT via a multi-protocol proxy architecture. Utilization of broadly accepted specifications such as W3C’s Semantic Sensor Network (SSN) ontology for semantic annotations of sensor data provides semantic interoperability between messages and supports semantic reasoning to obtain higher-level actionable knowledge from low-level sensor data.


Software for integrating different home automation systems and technologies into one single solution that allows over-arching automation rules and that offers uniform user interfaces. The open Home Automation Bus (openHAB) project aims at providing a universal integration platform for all things around home automation. It is a pure Java solution, fully based on OSGi. The Equinox OSGi runtime and Jetty as a web server build the core foundation of the runtime.

It is designed to be absolutely vendor-neutral as well as hardware/protocol-agnostic. openHAB brings together different bus systems, hardware devices and interface protocols by dedicated bindings. These bindings send and receive commands and status updates on the openHAB event bus. This concept allows designing user interfaces with a unique look and feel, but with the possibility to operate devices based on a large number of different technologies. Besides the user interfaces, it also brings the power of automation logic across different system boundaries.


Eclipse SmartHome is a framework for building smart home solutions. As such, it consists of a rich set of OSGi bundles that serve different purposes. Not all solutions that build on top of Eclipse SmartHome will require all of those bundles - instead they can choose what parts are interesting for them.


TANGO is a software toolkit for connecting things together, building control systems, and integrating systems. It is free, open source and object-oriented. It is easy to use and is well adapted to solving simple and complex distributed problems. TANGO Controls has been used to build solutions for:

  • Distributed Control Systems (DCS) in which devices are controlled and monitored in a local distributed network

  • Supervisory Control And Data Acquisition (SCADA) systems in which remote devices are controlled and monitored centrally

  • Integrated Control Systems (ICS) in which different autonomous control systems are integrated into a central one

  • Interface Devices that run on small embedded platforms into a distributed control system

  • Internet of Things (IoT) applications in which arbitrary devices are controlled through the Internet

  • Machine to Machine (M2M) applications in which devices communicate with each other

  • System Integration Platforms in which different kinds of software applications and systems are integrated into a central one

TANGO Controls is operating system independent and supports C++, Java and Python for all of the components.


Taurus is a Python framework for both CLI and GUI Tango applications. It is built on top of PyTango and PyQt. Taurus stands for TAngo User interface ‘R’ US.


IPFS is a distributed file system that seeks to connect all computing devices with the same system of files. In some ways, this is similar to the original aims of the Web, but IPFS is actually more similar to a single BitTorrent swarm exchanging Git objects.

It combines good ideas from Git, BitTorrent, Kademlia, SFS, and the Web. IPFS provides an interface as simple as the HTTP web, but with permanence built in.
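
The Git analogy rests on content addressing: an object's name is a hash of its bytes, so any peer can serve it and any receiver can verify it. The sketch below computes the ID Git gives a blob and uses it as the key of a toy store. IPFS's own addressing scheme is not reproduced here, and `ContentStore` is a hypothetical class.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Object ID Git assigns to a blob: SHA-1 over a header plus the content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ContentStore:
    """Toy store whose keys are derived from the stored bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        key = git_blob_id(data)
        self._objects[key] = data      # identical content maps to one entry
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        if git_blob_id(data) != key:   # a receiver can always re-check integrity
            raise ValueError("stored object does not match its address")
        return data
```

Because the address depends only on the content, storing the same bytes twice yields the same key, and content that has been altered no longer matches the address it was requested under.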


IPredator provides you with an encrypted tunnel from your computer to the Internet. We are hiding your real IP address behind one of ours.


Iris seeks to provide a powerful, easy to use, and community-driven Python library for analysing and visualising meteorological and oceanographic data sets.


IRPF90 is a Fortran programming environment which helps the development of large Fortran codes by applying the Implicit Reference to Parameters method (IRP).

In Fortran programs, the programmer has to focus on the order of the instructions: before using a variable, the programmer has to be sure that it has already been computed in all possible situations. For large codes, this is a common source of error.

In IRPF90 most of the order of instructions is handled by the pre-processor, and an automatic mechanism guarantees that every entity is built before being used. This mechanism relies on the needs/needed by relations between the entities, which are built automatically.
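
The mechanism can be illustrated outside Fortran. In the Python analogy below, each entity has a provider function that runs at most once, the first time the entity is needed, so the build order follows from the dependencies rather than from the order of statements. This is not IRPF90 syntax; the `Entities` class and the physical quantities are hypothetical.

```python
class Entities:
    """Builds each entity on first use by calling its provider once."""

    def __init__(self):
        self._providers = {}
        self._values = {}
        self.build_order = []          # recorded only to show what happened

    def provider(self, func):
        """Register func as the provider of the entity named after it."""
        self._providers[func.__name__] = func
        return func

    def __getattr__(self, name):
        # Reached only for names that are not regular attributes.
        providers = self.__dict__.get("_providers", {})
        if name not in providers:
            raise AttributeError(name)
        if name not in self._values:
            self._values[name] = providers[name](self)
            self.build_order.append(name)
        return self._values[name]


e = Entities()


@e.provider
def kinetic_energy(e):
    return 0.5 * e.mass * e.velocity ** 2   # needs mass and velocity


@e.provider
def mass(e):
    return 2.0


@e.provider
def velocity(e):
    return 3.0
```

Reading `e.kinetic_energy` triggers the providers of `mass` and `velocity` first, even though they were declared later, and a second read reuses the stored value.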

Codes written with IRPF90 often execute faster than Fortran programs, are faster to write and are easier to maintain.


A dynamic computer programming language. It is most commonly used as part of web browsers, whose implementations allow client-side scripts to interact with the user, control the browser, communicate asynchronously, and alter the document content that is displayed. It is also used in server-side network programming with runtime environments such as Node.js, game development and the creation of desktop and mobile applications. With the rise of the single-page web app and JavaScript-heavy sites, it is increasingly being used as a compile target for source-to-source compilers from both dynamic languages and static languages. In particular, Emscripten and highly optimised JIT compilers, in tandem with asm.js which is friendly to AOT compilers like OdinMonkey, have enabled C and C++ programs to be compiled into JavaScript and execute at near-native speeds, leading JavaScript to be called the "assembly language of the web", according to its creator and others.


A low-level, extraordinarily optimizable subset of JavaScript. It is an intermediate programming language consisting of a strict subset of the JavaScript language. It enables significant performance improvements for web applications that are written in statically-typed languages with manual memory management (such as C) and then translated to JavaScript by a source-to-source compiler. Asm.js does not aim to improve the performance of hand-written JavaScript code, nor does it enable anything other than enhanced performance.

It is intended to have performance characteristics closer to that of native code than standard JavaScript by limiting language features to those amenable to ahead-of-time optimization and other performance improvements. By using a subset of JavaScript, asm.js is already supported by all major web browsers, unlike alternative approaches such as Google Native Client. Mozilla Firefox was the first web browser to implement asm.js-specific optimizations, starting with Firefox 22. The optimizations of Google Chrome’s V8 JavaScript engine in Chrome 28 made asm.js benchmarks more than twice as fast as prior versions of Chrome.

See Emscripten.


DynJS is an ECMAScript runtime for the JVM.


Nashorn’s goal is to implement a lightweight high-performance JavaScript runtime in Java with a native JVM. The project intends to enable Java developers to embed JavaScript in Java applications via JSR-223 and to develop free-standing JavaScript applications using the jrunscript command-line tool.


Rhino is an open-source implementation of JavaScript written entirely in Java. It is typically embedded into Java applications to provide scripting to end users. It is embedded in J2SE 6 as the default Java scripting engine.


SpiderMonkey is Mozilla’s JavaScript engine written in C/C++. It is used in various Mozilla products, including Firefox, and is available under the MPL2.

SpiderMonkey is the code name for the first-ever JavaScript engine, written by Brendan Eich at Netscape Communications, later released as open source and now maintained by the Mozilla Foundation. SpiderMonkey provides JavaScript support for Mozilla Firefox and various embeddings such as the GNOME 3 desktop.

Eich "wrote JavaScript in ten days" in 1995, having been "recruited to Netscape with the promise of doing Scheme in the browser". (The idea of using Scheme was abandoned when "engineering management [decided] that the language must ‘look like Java’".) In the fall of 1996, Eich, needing to "pay off [the] substantial technical debt" left from the first year, "stayed home for two weeks to rewrite Mocha as the codebase that became known as SpiderMonkey". The name SpiderMonkey was chosen as a reference to the movie Beavis and Butt-head Do America, in which the character Tom Anderson mentions that the title characters were "whacking off like a couple of spider monkeys." In 2011, Eich transferred management of the SpiderMonkey code to Dave Mandelin.


The V8 JavaScript Engine is an open source JavaScript engine developed by Google for the Google Chrome web browser. V8 compiles JavaScript to native machine code (IA-32, x86-64, ARM, or MIPS ISAs) before executing it, instead of more traditional techniques such as interpreting bytecode or compiling the whole program to machine code and executing it from a filesystem. The compiled code is additionally optimized (and re-optimized) dynamically at runtime, based on heuristics of the code’s execution profile. Optimization techniques used include inlining, elision of expensive runtime properties, and inline caching, among many others.


Jekyll is a simple, blog-aware, static site generator perfect for personal, project, or organization sites. Think of it like a file-based CMS, without all the complexity. Jekyll takes your content, renders Markdown and Liquid templates, and spits out a complete, static website ready to be served by Apache, Nginx or another web server. Jekyll is the engine behind GitHub Pages, which you can use to host sites right from your GitHub repositories.


Complex systems are increasingly being viewed as distributed information processing systems, particularly in the domains of computational neuroscience, bioinformatics and Artificial Life. This trend has resulted in a strong uptake in the use of (Shannon) information-theoretic measures to analyse the dynamics of complex systems in these fields. We introduce the Java Information Dynamics Toolkit (JIDT): a Google code project which provides a standalone, (GNU GPL v3 licensed) open-source code implementation for empirical estimation of information-theoretic measures from time-series data. While the toolkit provides classic information-theoretic measures (e.g. entropy, mutual information, conditional mutual information), it ultimately focusses on implementing higher-level measures for information dynamics. That is, JIDT focusses on quantifying information storage, transfer and modification, and the dynamics of these operations in space and time. For this purpose, it includes implementations of the transfer entropy and active information storage, their multivariate extensions and local or pointwise variants. JIDT provides implementations for both discrete and continuous-valued data for each measure, including various types of estimator for continuous data (e.g. Gaussian, box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time due to Java’s object-oriented polymorphism. Furthermore, while written in Java, the toolkit can be used directly in MATLAB, GNU Octave, Python and other environments. We present the principles behind the code design, and provide several examples to guide users.


Joblib provides a simple helper class to write parallel for loops using multiprocessing. The core idea is to write the code to be executed as a generator expression, and convert it to parallel computing.


Jolie is an open-source programming language for developing distributed applications based on microservices. In the programming paradigm proposed with Jolie, each program is a service that can communicate with other programs by sending and receiving messages over a network.


In a decentralized computing environment, it is better practice to pass program code to the various machines for execution (and then gather the results) when the application is dealing with huge amounts of data. However, how can machines of various configurations understand each other? Also, the "move the code, move the data as little as possible" policy may work better with functional programming than with imperative programming.

These questions lead to the idea of doing functional programming in JSON. If programs can be coded in JSON, they can be easily shipped around and understood by machines of various settings. Combining JSON and functional programming also makes security issues easier to track or manage.

JSON-FP is part of an attempt to make data freely and easily accessed, distributed, annotated, meshed, even re-emerged with new values. To achieve that, it’s important to be able to ship code to where the data reside, and that’s what JSON-FP is trying to achieve.
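To make the idea of "programs coded in JSON" more concrete, here is a minimal Python sketch of an evaluator for JSON-encoded expressions. The expression format (a one-key object naming an operator) and the operator names `input`, `add`, `mul` and `map` are invented for this illustration and are not JSON-FP's actual syntax.

```python
import json


def evaluate(expr, value):
    """Evaluate a JSON-decoded expression against an input value.

    Hypothetical format: a one-key dict {"op": argument} is an operator
    application; anything else is treated as a literal.
    """
    if isinstance(expr, dict) and len(expr) == 1:
        (op, arg), = expr.items()
        if op == "input":
            # Refers to the value the program is being applied to.
            return value
        if op == "add":
            return sum(evaluate(a, value) for a in arg)
        if op == "mul":
            result = 1
            for a in arg:
                result *= evaluate(a, value)
            return result
        if op == "map":
            # arg = {"over": <expression yielding a list>, "apply": <expression>}
            items = evaluate(arg["over"], value)
            return [evaluate(arg["apply"], item) for item in items]
        raise ValueError(f"unknown operator: {op}")
    return expr  # numbers, strings, lists, etc. are literals


# The program is plain JSON text, so it can be shipped to wherever the data live.
program_text = (
    '{"map": {"over": {"input": null},'
    ' "apply": {"add": [{"mul": [{"input": null}, 2]}, 1]}}}'
)
program = json.loads(program_text)
result = evaluate(program, [1, 2, 3])  # computes 2*x + 1 for each element
```

Because the program is pure data with a fixed operator table, the receiving machine can inspect it before running it, which is one way to read the security remark above.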


A high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, largely written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.


The next generation of IPython notebooks. IPython will continue to exist as a Python kernel for Jupyter, but the notebook and other language-agnostic parts of IPython will move to new projects under the Jupyter name. IPython 3.0 will be the last monolithic release of IPython.


This repository contains custom Contents classes that allow IPython to use Google Drive for file management. The code is organized as a Python package that contains functions to install a Jupyter Notebook JavaScript extension, and activate/deactivate different IPython profiles to be used with Google Drive.


Multi-user server for Jupyter notebooks.


Jupyter nbviewer is the web application behind The Jupyter Notebook Viewer, which is graciously hosted by Rackspace. Run this locally to get most of the features of nbviewer on your own network.


A Java virtual machine (JVM) is an abstract computing machine. There are three notions of the JVM: specification, implementation, and instance. The specification is a book that formally describes what is required of a JVM implementation. Having a single specification ensures all implementations are interoperable. A JVM implementation is a computer program that implements requirements of the JVM specification in a compliant and preferably performant manner. An instance of the JVM is a process that executes a computer program compiled into Java bytecode.


Kahler, a Python library that implements discrete exterior calculus on arbitrary Hermitian manifolds. Borrowing techniques and ideas first implemented in PyDEC, Kahler provides a uniquely general framework for computation using discrete exterior calculus. Manifolds can have arbitrary dimension, topology, bilinear Hermitian metrics, and embedding dimension. Kahler comes equipped with tools for generating triangular meshes in arbitrary dimensions with arbitrary topology. Kahler can also generate discrete sharp operators and implement de Rham maps. Computationally intensive tasks are automatically parallelized over the number of cores detected. The program itself is written in Cython—​a superset of the Python language that is translated to C and compiled for extra speed. Kahler is applied to several example problems: normal modes of a vibrating membrane, electromagnetic resonance in a cavity, the quantum harmonic oscillator, and the Dirac-Kahler equation. Convergence is demonstrated on random meshes.


KBLAS (KAUST-BLAS) is a small open-source library that optimizes critical numerical kernels on CUDA-enabled GPUs. KBLAS provides a subset of standard BLAS functions. It also proposes some functions with BLAS-like interfaces that target both single- and multi-GPU systems.

The ultimate goal for KBLAS is performance. KBLAS has a set of tuning parameters that affect its performance according to the GPU architecture and the CUDA runtime version. While we cannot guarantee optimal performance with the default tuning parameters, users can easily edit such parameters on their local systems. KBLAS might be shipped with autotuners in the future.


Visualize logs and time-stamped data. Elasticsearch works seamlessly with Kibana to let you see and interact with your data.


The KSTAR project supports the development of Klang, a source-to-source compiler that turns C programs with OpenMP pragmas to C programs with calls to either the StarPU or the Kaapi runtime system.


KVM (for Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko, that provides the core virtualization infrastructure and a processor specific module, kvm-intel.ko or kvm-amd.ko. KVM also requires a modified QEMU although work is underway to get the required changes upstream.

Using KVM, one can run multiple virtual machines running unmodified Linux or Windows images. Each virtual machine has private virtualized hardware: a network card, disk, graphics adapter, etc.

The kernel component of KVM is included in mainline Linux, as of 2.6.20.


Kimchi is an HTML5 based management tool for KVM. It is designed to make it as easy as possible to get started with KVM and create your first guest.

Kimchi runs as a daemon on the hypervisor host. It manages KVM guests through libvirt. The management interface is accessed over the web using a browser that supports HTML5.


A WordPress site builder.


LCS Tool performs computations for the analysis of Lagrangian coherent structures.


Lea is a Python package for working with discrete probability distributions in an intuitive way. It allows you to model a broad range of random phenomena, like dice throwing, coin tossing, gambling, weather, finance, etc. More generally, Lea may be used for any finite set of discrete values having known probability: numbers, booleans, date/times, symbols, … Each probability distribution is modeled as a plain object, which can be named, displayed, queried or processed to produce new probability distributions.
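The notion of a distribution as a plain object that can be queried and combined into new distributions can be sketched in a few lines of standard-library Python. The `Dist` class and its methods below are hypothetical and are not Lea's API; the sketch also assumes the combined variables are independent.

```python
from fractions import Fraction
from itertools import product


class Dist:
    """A finite discrete distribution stored as {value: exact probability}."""

    def __init__(self, weights):
        total = sum(weights.values())
        # Normalise so the probabilities sum to exactly 1.
        self.probs = {v: Fraction(w) / total for v, w in weights.items()}

    @classmethod
    def uniform(cls, values):
        return cls({v: Fraction(1) for v in values})

    def combine(self, other, func):
        """Distribution of func(a, b) for independent a ~ self, b ~ other."""
        out = {}
        for (a, pa), (b, pb) in product(self.probs.items(), other.probs.items()):
            key = func(a, b)
            out[key] = out.get(key, Fraction(0)) + pa * pb
        return Dist(out)

    def __add__(self, other):
        return self.combine(other, lambda a, b: a + b)

    def p(self, predicate):
        """Probability that a drawn value satisfies the predicate."""
        return sum((p for v, p in self.probs.items() if predicate(v)), Fraction(0))


die = Dist.uniform(range(1, 7))
two_dice = die + die                          # a new distribution object
p_seven = two_dice.p(lambda v: v == 7)        # Fraction(1, 6)
```

Exact fractions are used instead of floats so that queries such as the probability of rolling a seven come out as 1/6 rather than an approximation.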


Leaflet is a modern open-source JavaScript library for mobile-friendly interactive maps.


Authors and publishers use Leanpub to publish amazing in-progress and completed books.


We introduce Lemon, an MPI parallel I/O library that provides efficient parallel I/O of both binary and metadata on massively parallel architectures. Motivated by the demands of the Lattice Quantum Chromodynamics community, the data is stored in the SciDAC Lattice QCD Interchange Message Encapsulation format. This format allows for storing large blocks of binary data and corresponding metadata in the same file. Even if designed for LQCD needs, this format might be useful for any application with this type of data profile.


A free/open source C++ library that provides implementations of various (approximate) inference methods for discrete graphical models. libDAI supports arbitrary factor graphs with discrete variables; this includes discrete Markov Random Fields and Bayesian Networks. libDAI is not intended to be a complete package for approximate inference. Instead, it should be considered as an "inference engine", providing various inference methods. In particular, it contains no GUI and currently only supports its own file format for input and output (although it can read files in FastInf format).


libeemd is a C library for performing the ensemble empirical mode decomposition (EEMD), its complete variant (CEEMDAN) or the regular empirical mode decomposition (EMD). It includes a Python interface called pyeemd. The details of what libeemd actually computes are available as a separate article, which you should read if you are unsure about what EMD, EEMD and CEEMDAN are.


A set of tools for accessing and modifying virtual machine (VM) disk images. You can use this for viewing and editing files inside guests, scripting changes to VMs, monitoring disk used/free statistics, creating guests, P2V, V2V, performing backups, cloning VMs, building VMs, formatting disks, resizing disks, and much more.

libguestfs can access almost any disk image imaginable. It can do it securely — without needing root and with multiple layers of defence against rogue disk images. It can access disk images on remote machines or on CDs/USB sticks. It can access proprietary systems like VMware and Hyper-V.

All this functionality is available through a scriptable shell called guestfish, or an interactive rescue shell virt-rescue.

libguestfs is a C library that can be linked with C and C++ management programs and has bindings for about a dozen other programming languages. Using our FUSE module you can also mount guest filesystems on the host.


A multi-platform support library with a focus on asynchronous I/O. It was primarily developed for use by Node.js, but it’s also used by Luvit, Julia, pyuv, and others.


A toolkit to interact with the virtualization capabilities of recent versions of Linux (and other OSes).


Lighthouse is a framework for creating, maintaining, and using a taxonomy of available software that can be used to build highly-optimized matrix algebra computations. The taxonomy provides an organized anthology of software components and programming tools needed for that task. The taxonomy will serve as a guide to practitioners seeking to learn what is available for their programming tasks, how to use it, and how the various parts fit together. It builds upon and improves existing collections of numerical software, adding tools for the tuning of matrix algebra computations.


Limulus is an acronym for LInux MULti-core Unified Supercomputer. The Limulus project goal is to create and maintain an open specification and software stack for a personal workstation cluster. Ideally, a user should be able to build or purchase a small personal workstation cluster using the Limulus reference design and low cost hardware. In addition, a freely available turn-key Linux based software stack will be created and maintained for use on the Limulus design. A Limulus is intended to be a workstation cluster platform where users can develop software, test ideas, run small scale applications, and teach HPC methods.


LinuxCNC (the Enhanced Machine Control) is a software system for computer control of machine tools such as milling machines and lathes. It provides:

  • several graphical user interfaces including one for touch screens

  • an interpreter for "G-code" (the RS-274 machine tool programming language)

  • a realtime motion planning system with look-ahead

  • operation of low-level machine electronics such as sensors and motor drives

  • an easy to use "breadboard" layer for quickly creating a unique configuration for your machine

  • a software PLC programmable with ladder diagrams

  • easy installation with .deb packages or a Live-CD

It does not provide drawing (CAD - Computer Aided Design) or G-code generation from the drawing (CAM - Computer Automated Manufacturing) functions.

It can simultaneously move up to 9 axes and supports a variety of interfaces. The control can operate true servos (analog or PWM) with the feedback loop closed by the LinuxCNC software at the computer, or open loop with "step-servos" or stepper motors. Motion control features include: cutter radius and length compensation, path deviation limited to a specified tolerance, lathe threading, synchronized axis motion, adaptive feedrate, operator feed override, and constant velocity control. Support for non-Cartesian motion systems is provided via custom kinematics modules. Available architectures include hexapods (Stewart platforms and similar concepts) and systems with rotary joints to provide motion such as PUMA or SCARA robots. LinuxCNC runs on Linux using real time extensions. Support currently exists for version 2.4 and 2.6 Linux kernels with real time extensions applied by RT-Linux or RTAI patches.


Livingstone2 is a reusable artificial intelligence (AI) software system designed to assist spacecraft, life support systems, chemical plants or other complex systems in operating robustly with minimal human supervision, even in the face of hardware failures or unexpected events. Livingstone2 diagnoses the current state of the spacecraft or other system and recommends commands or repair actions that will allow the system to continue operations.

Livingstone2 is an enhancement and re-engineering of the Livingstone diagnosis system that was flight tested on-board the Deep Space One spacecraft in May 1999. It contains significant enhancements to robustness, performance and usability. Livingstone2 is able to track multiple diagnostic hypotheses, as opposed to a single hypothesis in Livingstone. It is also able to revise diagnostic decisions made in the past when additional observations become available. In such cases, Livingstone might find the incorrect hypothesis. These improvements increase robustness.

Re-architecting and re-implementing the system in C++ has increased performance. Usability has been vastly improved by creating a set of development tools which are closely integrated with the Livingstone2 engine. In addition to the core diagnosis engine, Livingstone2 now includes a compiler that translates diagnostic models written in a Java-like language into Livingstone2’s language, and a broad set of graphical tools for model development. These software tools support the rapid deployment of model-based representations of complex systems for Livingstone2 via a visual model builder/tester (Stanley), and two graphical user interface tools (Candidate Manager and History Table) which provide Livingstone2 status information during testing. Runtime support is provided by the real-time interface (RTI) which converts analog sensor readings to the digital values required by Livingstone2.

Also included in the Livingstone2 download is Oliver, a prototype model builder/tester, which is incomplete but could be used as a starting place for a new model builder/tester.


The LLVM compiler infrastructure project (formerly Low Level Virtual Machine) is a compiler infrastructure designed as a set of reusable libraries with well-defined interfaces. It is written in C++ and is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Originally implemented for C and C++, the language-agnostic design (and the success) of LLVM has since spawned a wide variety of front ends: languages with compilers that use LLVM include Common Lisp, ActionScript, Ada, D, Fortran, OpenGL Shading Language, Go, Haskell, Java bytecode, Julia, Objective-C, Swift, Python, Ruby, Rust, Scala, C# and Lua.


This project aims to fully build the Linux kernel using Clang which is the C front end for the LLVM compiler infrastructure project. Together Clang and LLVM have many positive attributes and features which many developers and system integrators would like to take advantage of when developing and deploying the Linux Kernel as a part of their own projects.


Pure is a modern-style functional programming language based on term rewriting. It offers equational definitions with pattern matching, full symbolic rewriting capabilities, dynamic typing, eager and lazy evaluation, lexical closures, built-in list and matrix support and an easy-to-use C interface. The interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native code.


Implementation of the LLVM tutorial in Python.

Loopy lets you easily generate the tedious, complicated code that is necessary to get good performance out of GPUs and multi-core CPUs.

Loopy’s core idea is that a computation should be described simply and then transformed into a version that gets high performance. This transformation takes place under user control, from within Python.


LROSE is an NSF-backed project to develop common software for the LIDAR, RADAR and PROFILER community.


The Larval TRANSport Lagrangian model (LTRANS v.2b) is an off-line particle-tracking model that runs with the stored predictions of a 3D hydrodynamic model, specifically the Regional Ocean Modeling System (ROMS). Although LTRANS was built to simulate oyster larvae, it can easily be adapted to simulate passive particles and other planktonic organisms. LTRANS v.2 is written in Fortran 90 and is designed to track the trajectories of particles in three dimensions. It includes a 4th order Runge-Kutta scheme for particle advection and a random displacement model for vertical turbulent particle motion. Reflective boundary conditions, larval behavior, and settlement routines are also included.
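As a rough illustration of the 4th order Runge-Kutta advection step mentioned above, the following Python sketch moves one particle through a velocity field. It is not LTRANS code (which is Fortran 90 and reads stored ROMS output); the analytic `rotation` field and the function names are assumptions made for the example, and only two dimensions are shown for brevity.

```python
import math


def rk4_step(velocity, pos, t, dt):
    """Advance a particle position by one classical RK4 step.

    velocity(pos, t) must return a tuple with the same length as pos.
    """
    def shifted(p, k, h):
        return tuple(pi + h * ki for pi, ki in zip(p, k))

    k1 = velocity(pos, t)
    k2 = velocity(shifted(pos, k1, dt / 2), t + dt / 2)
    k3 = velocity(shifted(pos, k2, dt / 2), t + dt / 2)
    k4 = velocity(shifted(pos, k3, dt), t + dt)
    return tuple(
        p + dt / 6 * (a + 2 * b + 2 * c + d)
        for p, a, b, c, d in zip(pos, k1, k2, k3, k4)
    )


def advect(velocity, pos, t0, dt, steps):
    """Track a particle for a number of fixed-size time steps."""
    t = t0
    for _ in range(steps):
        pos = rk4_step(velocity, pos, t, dt)
        t += dt
    return pos


def rotation(pos, t):
    """Hypothetical test field: solid-body rotation about the origin."""
    x, y = pos
    return (-y, x)


# One full revolution should bring the particle back close to its start.
end = advect(rotation, (1.0, 0.0), 0.0, 2 * math.pi / 1000, 1000)
```

In a model of this kind the velocity callback would interpolate the stored hydrodynamic fields in space and time, and a random displacement would be added separately for vertical turbulence.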


LuxMark is an OpenCL benchmark tool. The idea for the program was conceived in 2009 by Jean-Francois "Jromang" Romang. It was intended as a promotional tool for LuxRender (to quote Jromang’s original words: "LuxRender propaganda with OpenCL"). The idea was quite simple: wrap SLG inside an easy-to-use graphical user interface and use it as a benchmark for OpenCL.


The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the server cluster is fully transparent to end users, and the users interact as if it were a single high-performance virtual server.

The real servers and the load balancers may be interconnected by either a high-speed LAN or a geographically dispersed WAN. The load balancers can dispatch requests to the different servers and make the parallel services of the cluster appear as a virtual service on a single IP address; request dispatching can use IP load balancing technologies or application-level load balancing technologies. Scalability of the system is achieved by transparently adding or removing nodes in the cluster. High availability is provided by detecting node or daemon failures and reconfiguring the system appropriately.
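The dispatching and failover behaviour described above can be sketched as a toy model in Python. This is not how the Linux Virtual Server is implemented (it works inside the kernel at the IP level); the class and method names are hypothetical, and round-robin is used only as a simple scheduling policy for the sketch.

```python
class RoundRobinBalancer:
    """Toy dispatcher: one virtual service in front of several real servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0

    def add_server(self, server):
        # Scalability: nodes can be added without clients noticing.
        self.servers.append(server)
        self.healthy.add(server)

    def mark_down(self, server):
        # High availability: a failed node is skipped until it recovers.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def dispatch(self):
        """Return the real server that should handle the next request."""
        n = len(self.servers)
        for _ in range(n):
            server = self.servers[self._next % n]
            self._next = (self._next + 1) % n
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy real servers available")


lb = RoundRobinBalancer(["rs1", "rs2", "rs3"])
first_round = [lb.dispatch() for _ in range(4)]
lb.mark_down("rs2")  # simulate a detected node failure
after_failure = [lb.dispatch() for _ in range(3)]
```

Clients only ever talk to the balancer, so removing `rs2` from rotation changes which real server answers but not the address they connect to.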


The Magic Book Project is an open-source framework that facilitates the design and production of electronic and print books for authors. Rather than type into a word processor, the Magic Book Project allows an author to write a book once (using ASCIIDOC, a simple text document format) and procedurally generate the layout for a variety of formats using modern code-based design tools, such as CSS, the stylesheet standard. Write your book once, press a magic button, and out come multiple versions: printed hardcopy, digital PDF, HTML, MOBI, and EPUB.


Matrix algebra on GPU and multicore architectures. The MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current Multicore+GPU systems.


clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU-optimized BLAS and LAPACK for AMD hardware, can be found in the AMD clMath Libraries (formerly APPML).


Mal is a Clojure-inspired Lisp interpreter.

Mal is implemented in 26 different languages.

Mal is a learning tool. Each implementation of mal is separated into 11 incremental, self-contained (and testable) steps that demonstrate core concepts of Lisp. The last step is capable of self-hosting (running the mal implementation of mal).


Mantevo is a multi-faceted application performance project. It provides application performance proxies known as miniapps. Miniapps combine some or all of the dominant numerical kernels contained in an actual stand-alone application. Miniapps include libraries wrapped in a test driver providing representative inputs. They may also be hard-coded to solve a particular test case so as to simplify the need for parsing input files and mesh descriptions. Mini apps range in scale from partial, performance-coupled components of the application to a simplified representation of a complete execution path through the application.


TeaLeaf is a mini-app that solves the linear heat conduction equation on a spatially decomposed regular grid using a 5-point stencil with implicit solvers. TeaLeaf currently solves the equations in two dimensions, but three-dimensional support is in beta.

The solvers have been written in Fortran with OpenMP and MPI, and they have also been ported to OpenCL to provide an accelerated capability. Other versions invoke third-party linear solvers and currently include PETSc, Trilinos and Hypre, which are in beta release. For each of these versions there are instructions on how to download, build and link in the relevant library.
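For readers unfamiliar with the method, an implicit heat conduction step on a regular 2D grid with a 5-point stencil can be sketched in plain Python as follows. This is not TeaLeaf code: the function name, the fixed boundary values, and the choice of Jacobi iteration as the linear solver are assumptions made to keep the sketch short.

```python
def implicit_heat_step(u, r, iterations=200):
    """One backward-Euler step of u_t = alpha * laplacian(u) on a 2D grid.

    u is a list of rows; r = alpha * dt / dx**2. Boundary cells are held
    fixed. The linear system
        (1 + 4r) * new[j][i] - r * (sum of 4 neighbours of new) = u[j][i]
    is solved approximately with Jacobi iteration.
    """
    ny, nx = len(u), len(u[0])
    new = [row[:] for row in u]
    for _ in range(iterations):
        nxt = [row[:] for row in new]
        for j in range(1, ny - 1):
            for i in range(1, nx - 1):
                # 5-point stencil: the cell itself plus its four neighbours.
                neighbours = (new[j - 1][i] + new[j + 1][i]
                              + new[j][i - 1] + new[j][i + 1])
                nxt[j][i] = (u[j][i] + r * neighbours) / (1 + 4 * r)
        new = nxt
    return new


# A 5x5 plate, cold everywhere except one hot cell in the centre.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
after = implicit_heat_step(grid, r=0.5)
```

The implicit form stays stable for any value of `r`, at the cost of solving a linear system each step, which is why mini-apps of this kind are mostly exercises in linear solver performance.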

Mapbox Studio

Mapbox Studio gives you instant streaming access to massive global datasets like Mapbox Streets, Mapbox Terrain, and Mapbox Satellite without importing any data onto your computer.

Create your own vector tiles using Mapbox Studio. Convert data from traditional formats (Shapefile, GeoJSON, KML, GPX) and upload directly to Mapbox to deploy your vector tiles at scale.


MathFu is a C++ math library developed primarily for games focused on simplicity and efficiency.

It provides a suite of vector, matrix and quaternion classes to perform basic geometry suitable for game developers. This functionality can be used to construct geometry for graphics libraries like OpenGL or perform calculations for animation or physics systems.


Maven is a build automation tool used primarily for Java projects. The word maven means accumulator of knowledge in Yiddish.[3] Maven addresses two aspects of building software: First, it describes how software is built, and second, it describes its dependencies. Contrary to preceding tools like Apache Ant, it uses conventions for the build procedure, and only exceptions need to be written down. An XML file describes the software project being built, its dependencies on other external modules and components, the build order, directories, and required plug-ins. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging. Maven dynamically downloads Java libraries and Maven plug-ins from one or more repositories such as the Maven 2 Central Repository, and stores them in a local cache.[4] This local cache of downloaded artifacts can also be updated with artifacts created by local projects. Public repositories can also be updated.


MBDyn is the first and possibly the only free general purpose Multibody Dynamics analysis software. It features the integrated multidisciplinary simulation of multibody, multiphysics systems, including nonlinear mechanics of rigid and flexible bodies (geometrically exact & composite-ready beam and shell finite elements, component mode synthesis elements, lumped elements) subjected to kinematic constraints, along with smart materials, electric networks, active control, hydraulic networks, and essential fixed-wing and rotorcraft aerodynamics.

MBDyn simulates the behavior of heterogeneous mechanical, aeroservoelastic systems based on first principles equations. It can be easily coupled to external solvers for co-simulation of multiphysics problems, e.g. Computational Fluid Dynamics (CFD), terradynamics, block-diagram solvers like Scicos, Scicoslab and Simulink, using a simple C, C++ or Python peer-side API.

MBDyn is being actively developed and used in the aerospace (aircraft, helicopters, tiltrotors, spacecraft), wind energy (wind turbines), automotive (cars, trucks) and mechatronic fields (industrial robots, parallel robots, micro aerial vehicles (MAV)) for the analysis and simulation of the dynamics of complex systems.


Morse decompositions for piecewise constant vector fields.


MediaGoblin is a free software media publishing platform that anyone can run. You can think of it as a decentralized alternative to Flickr, YouTube, SoundCloud, etc.


A new innovative Python implementation harnessing Google’s super fast Dart Virtual Machine running Python at near native speeds.


A global optimization software tool that integrates two prominent population-based stochastic algorithms, namely Particle Swarm Optimization and Differential Evolution, with well established efficient local search procedures made available via the Merlin optimization environment. The resulting hybrid algorithms, also referred to as Memetic Algorithms, combine the space exploration advantage of their global part with the efficiency asset of the local search, and as expected they have displayed a highly efficient behavior in solving diverse optimization problems. The proposed software is carefully parametrized so as to offer complete control to fully exploit the algorithmic virtues. It is accompanied by comprehensive examples and a large set of widely used test functions, including tough atomic cluster and protein conformation problems.
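The hybrid scheme described above, a population-based global search combined with local refinement, can be sketched in standard-library Python. This is an illustration only and not the tool's implementation: it uses a basic Differential Evolution variant, a simple coordinate pattern search standing in for the Merlin local search procedures, and parameter names and defaults chosen for the example.

```python
import random


def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def local_search(f, x, step=0.1, shrink=0.5, rounds=20):
    """Coordinate pattern search: try +/- step on each axis, shrink on failure."""
    best, fbest = list(x), f(x)
    for _ in range(rounds):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = best[:]
                cand[i] += delta
                fc = f(cand)
                if fc < fbest:
                    best, fbest, improved = cand, fc, True
        if not improved:
            step *= shrink
    return best, fbest


def memetic_de(f, dim, bounds, pop_size=20, generations=50,
               F=0.7, CR=0.9, seed=0):
    """Differential Evolution with a local search applied to the best member."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Global part: mutation and crossover from three other members.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)
            trial = [
                pop[a][k] + F * (pop[b][k] - pop[c][k])
                if (rng.random() < CR or k == jrand) else pop[i][k]
                for k in range(dim)
            ]
            ft = f(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
        # Memetic part: refine the current best individual locally.
        best = min(range(pop_size), key=fit.__getitem__)
        pop[best], fit[best] = local_search(f, pop[best])
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


best_x, best_f = memetic_de(sphere, dim=3, bounds=(-5.0, 5.0))
```

How often the local search runs and on which individuals is the main design choice in such hybrids; refining only the best member once per generation is just one inexpensive option.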


Mercurium is a source-to-source compilation infrastructure aimed at fast prototyping. Currently supported languages are C and C++. Mercurium is mainly used in the Nanos environment to implement OpenMP, but since it is quite extensible it has been used to implement other programming models or compiler transformations; examples include Cell Superscalar, Software Transactional Memory, Distributed Shared Memory and the ACOTES project, to name a few.

Extending Mercurium is achieved using a plugin architecture, where plugins represent several phases of the compiler. These plugins are written in C++ and dynamically loaded by the compiler according to the chosen configuration. Code transformations are implemented in terms of source code (there is no need to modify or know the internal syntactic representation of the compiler).


DLB is a dynamic library designed to speed up hybrid applications by improving their load balance. DLB will redistribute the computational resources of the second level of parallelism to improve the load balance of the outer level of parallelism.


OmpSs is an effort to integrate features from the StarSs programming model developed by BSC into a single programming model. In particular, our objective is to extend OpenMP with new directives to support asynchronous parallelism and heterogeneity (devices like GPUs). However, it can also be understood as new directives extending other accelerator based APIs like CUDA or OpenCL. Our OmpSs environment is built on top of our Mercurium compiler and Nanos++ runtime system.


Nanos++ is a runtime designed to serve as runtime support in parallel environments. It is mainly used to support OmpSs, an extension to OpenMP developed at BSC. It also has modules to support OpenMP and Chapel.

Nanos++ provides services to support task parallelism using synchronizations based on data dependencies. Data parallelism is also supported by means of services mapped on top of its task support. Tasks are implemented as user-level threads when possible (currently x86, x86-64, ia64, ppc32 and ppc64 are supported).

Nanos++ also provides support for maintaining coherence across different address spaces (such as with GPUs or cluster nodes). It provides software directory and cache modules to this end.


MeteoIO can be seen as a set of modules that is focused on the handling of input/output operations (including data preparation) for numerical simulations in the realm of earth sciences. On the visible side, it offers the following modules, working on a pre-determined set of meteorological parameters or on parameters added by the developer:

  • a set of plugins for accessing the data (for example, a plugin might be responsible for fetching the raw data from a given database)

  • a set of filters and processing elements for applying transformations to the data (for example, a filter might remove all data that is out of range)

  • a set of resampling algorithms to temporally interpolate the data at the required timestamp

  • a set of parametrizations to generate data/meteorological parameters when they could not be interpolated

  • a set of spatial interpolation algorithms (for example, such an algorithm might perform Inverse Distance Weighting for filling a grid with spatially interpolated data)

Each of these steps can be configured and fine tuned according to the needs of the model and the wishes of the user.
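
The spatial interpolation step mentioned in the list above can be illustrated with a minimal Inverse Distance Weighting sketch. This is not MeteoIO code (MeteoIO is a C++ library with its own API). The function names, the station tuples and the `power` parameter are assumptions made for illustration only.

```python
import math


def idw_interpolate(stations, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) station tuples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The target coincides with a station: use its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def fill_grid(stations, xs, ys, power=2.0):
    """Return a list of rows (one per y) of interpolated values."""
    return [[idw_interpolate(stations, x, y, power) for x in xs] for y in ys]


# Hypothetical air temperatures (degrees C) measured at three stations.
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
grid = fill_grid(stations, xs=[0.0, 5.0, 10.0], ys=[0.0, 5.0, 10.0])
for row in grid:
    print(["%.2f" % value for value in row])
```

Grid cells that coincide with a station reproduce the measured value, and every other cell is a weighted average that stays between the smallest and largest station value.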


Meteor is an ultra-simple environment for building modern websites. What once took weeks, even with the best tools, now takes hours with Meteor.

The web was originally designed to work in the same way that mainframes worked in the 70s. The application server rendered a screen and sent it over the network to a dumb terminal. Whenever the user did anything, that server rerendered a whole new screen. This model served the Web well for over a decade. It gave rise to LAMP, Rails, Django, PHP.

But the best teams, with the biggest budgets and the longest schedules, now build applications in JavaScript that run on the client. These apps have stellar interfaces. They don’t reload pages. They are reactive: changes from any client immediately appear on everyone’s screen.

They’ve built them the hard way. Meteor makes it an order of magnitude simpler, and a lot more fun. You can build a complete application in a weekend, or a sufficiently caffeinated hackathon. No longer do you need to provision server resources, or deploy API endpoints in the cloud, or manage a database, or wrangle an ORM layer, or swap back and forth between JavaScript and Ruby, or broadcast data invalidations to clients.


The simulation and parameter optimization of coupled ocean circulation and ecosystem models in three space dimensions is one of the most challenging tasks in numerical climate research. Here we present a scientific toolkit that aims at supporting researchers by defining clear coupling interfaces, providing state-of-the-art numerical methods for simulation, parallelization and optimization while using only freely available and (to a great extent) platform-independent software. Besides defining a user-friendly coupling interface (API) for marine ecosystem or biogeochemical models, we heavily rely on the Portable, Extensible Toolkit for Scientific computation (PETSc) developed at Argonne Nat. Lab. for a wide variety of parallel linear and non-linear solvers and optimizers. We specifically focus on the usage of matrix-free Newton-Krylov methods for the fast computation of steady periodic solutions, and make use of the Transport Matrix Method (TMM).


micro-CernVM is the heart of the CernVM 3 virtual appliance. It is based on Scientific Linux 6 combined with a custom, virtualization-friendly Linux kernel. This image is also fully RPM based; you can use yum and rpm to install additional packages.

micro-CernVM’s outstanding feature is that it does not require a hard disk image to be distributed (hence "micro"). Instead it is distributed as a CD-ROM image of ~10MB containing a Linux kernel and the CernVM-FS client. The rest of the operating system is downloaded and cached on demand by CernVM-FS. The virtual machine still requires a hard disk as a persistent cache, but this hard disk is initially empty and can be created instantaneously, instead of being pre-created and distributed.


A Lisp implemented in < 1 KB of JavaScript with macros, TCO, interop and exception handling.


Mirage OS is a library operating system that constructs unikernels for secure, high-performance network applications across a variety of cloud computing and mobile platforms. Code can be developed on a normal OS such as Linux or MacOS X, and then compiled into a fully-standalone, specialised unikernel that runs under the Xen hypervisor.

Since Xen powers most public cloud computing infrastructure such as Amazon EC2 or Rackspace, this lets your servers run more cheaply, securely and with finer control than with a full software stack.

Mirage uses the OCaml language, with libraries that provide networking, storage and concurrency support that work under Unix during development, but become operating system drivers when being compiled for production deployment. The framework is fully event-driven, with no support for preemptive threading.


A parallelized Python library for finding modal decompositions and reduced-order models. Parallel implementations of the proper orthogonal decomposition (POD), balanced POD (BPOD), dynamic mode decomposition (DMD), and Petrov-Galerkin projection are provided, as well as serial implementations of the Observer Kalman filter Identification method (OKID) and the Eigensystem Realization Algorithm (ERA). Modred is applicable to a wide range of problems and nearly any type of data.


A next generation web framework for the Perl programming language.


Mondrian is a general purpose statistical data-visualization system. It features outstanding interactive visualization techniques for data of almost any kind, and has particular strengths, compared to other tools, for working with Categorical Data, Geographical Data and LARGE Data.

All plots in Mondrian are fully linked, and offer many interactions and queries. Any case selected in a plot in Mondrian is highlighted in all other plots.

Currently implemented plots comprise Histograms, Boxplots y by x, Scatterplots, Barcharts, Mosaicplots, Missing Value Plots, Parallel Coordinates/Boxplots, SPLOMs and Maps.

Mondrian works with data in standard tab-delimited or comma-separated ASCII files and can load data from R workspaces. There is basic support for working directly on data in Databases (please email for further info).

Mondrian is written in Java and is distributed as a native application (wrapper) for MacOS X and Windows. Linux users need to start the jar file.


MongoDB (from humongous) is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.


D4 is an automated tool for generating distributed document database designs for applications running on MongoDB. This tool specifically targets applications running highly concurrent workloads, and thus its designs are tailored to the unique properties of large-scale, Web-based applications. It can also be used to assist in porting MySQL-based applications to MongoDB.

Using a sample workload trace from either a document-oriented or relational database application, D4 will compute the best database design that optimizes the throughput and latency of a document DBMS.


Mopidy is an extensible music server written in Python.

Mopidy plays music from local disk, Spotify, SoundCloud, Google Play Music, and more. You edit the playlist from any phone, tablet, or computer using a range of MPD and web clients.


MORSE is a generic simulator for academic robotics. It focuses on realistic 3D simulation of small to large environments, indoor or outdoor, with one to tens of autonomous robots.

MORSE can be entirely controlled from the command-line. Simulation scenes are generated from simple Python scripts.

MORSE comes with a set of standard sensors (cameras, laser scanner, GPS, odometry,…​), actuators (speed controllers, high-level waypoints controllers, generic joint controllers) and robotic bases (quadrotors, ATRV, Pioneer3DX, generic 4 wheel vehicle, PR2,…​). New ones can easily be added.

MORSE rendering is based on the Blender Game Engine. The OpenGL-based Game Engine supports shaders, provides advanced lighting options, supports multi-texturing, and uses the state-of-the-art Bullet library for physics simulation.


MoviePy is a Python module for video editing, which can be used for basic operations (like cuts, concatenations, title insertions), video compositing (a.k.a. non-linear editing), video processing, or to create advanced effects. It can read and write the most common video formats, including GIF.


Comparison among OOP versions of an MPDATA code written using Python, Fortran and C++.


The mpld3 project brings together Matplotlib, the popular Python-based graphing library, and D3js, the popular Javascript library for creating interactive data visualizations for the web. The result is a simple API for exporting your matplotlib graphics to HTML code which can be used within the browser, within standard web pages, blogs, or tools such as the IPython notebook.


In a first course on classical mechanics, elementary physical processes like elastic two-body collisions, the mass–spring model, or the gravitational two-body problem are discussed in detail. The continuation to many-body systems, however, is deferred to graduate courses, although the underlying equations of motion are essentially the same and although high-school students in particular are strongly motivated by the use of particle systems in computer games. The missing link between the simple and the more complex problem is a basic introduction to solving the equations of motion numerically, which could be illustrated by means of the Euler method. The many-particle physics simulation package MPPhys offers a platform to experiment with simple particle simulations. The aim is to give a basic idea of how to implement many-particle simulations and how simulation and visualization can be combined for interactive visual explorations.
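
As a rough illustration of the Euler method mentioned above, the following sketch integrates the mass-spring model with the explicit Euler scheme. It is not taken from MPPhys. The function names, the spring constant `k`, the mass `m` and the step size `dt` are assumptions chosen for the example.

```python
import math


def euler_step(position, velocity, acceleration, dt):
    """Advance one explicit Euler step using the state at the start of the step."""
    new_position = position + dt * velocity
    new_velocity = velocity + dt * acceleration(position)
    return new_position, new_velocity


def simulate_spring(x0, v0, k=1.0, m=1.0, dt=0.001, steps=1000):
    """Integrate x'' = -(k/m) x and return a list of (time, position, velocity)."""
    def acceleration(x):
        return -(k / m) * x

    x, v = x0, v0
    trajectory = [(0.0, x, v)]
    for n in range(1, steps + 1):
        x, v = euler_step(x, v, acceleration, dt)
        trajectory.append((n * dt, x, v))
    return trajectory


trajectory = simulate_spring(x0=1.0, v0=0.0)
t_end, x_end, v_end = trajectory[-1]
# For k = m = 1 the exact solution is x(t) = cos(t).
print("numerical:", x_end, "exact:", math.cos(t_end))
```

For this oscillator the explicit Euler scheme slowly gains energy at every step, so small step sizes are needed. A many-particle code would apply the same update to every particle, with the acceleration computed from the forces between particles.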


MPWide is a light-weight communication library for distributed computing. It is specifically developed to allow message passing over long-distance networks using path-specific optimizations.


Morse-Smale Complex Extraction, Exploration, and Reasoning is a set of tools and libraries for feature extraction and exploration in scalar fields. MSCEER computes a gradient-based abstract representation of a scalar field.


The mscomplex3d package consists of two modules for the computation and analysis of Morse-Smale complexes on 3D grids. The first is a command-line executable named mscomplex3d. The second is a Python loadable module named pyms3d. The Morse-Smale complex is a topological data structure that partitions datasets based on the gradients of an input scalar function. The software computes the Morse-Smale complex of scalar functions defined on 3D structured grids and 2D triangle meshes.


MTT comprises a set of tools for modelling dynamic physical systems using the bond-graph methodology and transforming these models into representations suitable for analysis, control and simulation.


The Multiscale Coupling Library and Environment is a portable framework to do multiscale modeling and simulation on distributed computing resources. The generic coupling mechanism of MUSCLE is suitable for many types of multiscale applications, notably for multiscale models as defined by the MAPPER project or complex automata as defined in the COAST project. Submodels can be implemented from scratch, but legacy code can also be used with only minor adjustments. The runtime environment solves common problems in distributed computing and couples submodels of a multiscale model, whether they are built for high-performance supercomputers or for local execution. MUSCLE supports Java, C, C++, Fortran, Python, MATLAB and Scala code, using MPI, OpenMP, or threads.


Copies between local file systems are a daily activity. Files are constantly being moved to locations accessible by systems with different functions and/or storage limits, being backed up and restored, or being moved due to upgraded and/or replaced hardware. Hence, maximizing the performance of copies as well as checksums that ensure the integrity of copies is desirable to minimize the turnaround time of user and administrator activities. Modern parallel file systems provide very high performance for such operations using a variety of techniques such as striping files across multiple disks to increase aggregate I/O bandwidth and spreading disks across multiple servers to increase aggregate interconnect bandwidth.

To achieve peak performance from such systems, it is typically necessary to utilize multiple concurrent readers/writers from multiple systems to overcome various single-system limitations such as number of processors and network bandwidth. The standard cp and md5sum tools of GNU coreutils found on every modern Unix/Linux system, however, utilize a single execution thread on a single CPU core of a single system, hence cannot take full advantage of the increased performance of clustered file systems.

Mutil provides mcp and msum, which are drop-in replacements for cp and md5sum that utilize multiple types of parallelism to achieve maximum copy and checksum performance on clustered file systems. Multi-threading is used to ensure that nodes are kept as busy as possible. Read/write parallelism allows individual operations of a single copy to be overlapped using asynchronous I/O. Multi-node cooperation allows different nodes to take part in the same copy/checksum. Split file processing allows multiple threads to operate concurrently on the same file. Finally, hash trees allow inherently serial checksums to be performed in parallel.
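
The hash tree idea can be sketched in a few lines: the data is split into chunks, each chunk is hashed independently (and therefore possibly in parallel), and the chunk digests are hashed again to give one root checksum. This is a two-level illustration only and not the format used by msum. The chunk size, the use of SHA-256 as a stand-in hash, and the function names are assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def chunk_digest(chunk):
    """Hash one chunk; chunks are independent, so this can run concurrently."""
    return hashlib.sha256(chunk).digest()


def tree_checksum(data, chunk_size=1 << 20, workers=4):
    """Return the hex digest of a two-level hash tree over `data`."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so the root hash is deterministic.
        leaves = list(pool.map(chunk_digest, chunks))
    return hashlib.sha256(b"".join(leaves)).hexdigest()


payload = bytes(range(256)) * 4096
print(tree_checksum(payload, chunk_size=64 * 1024))
```

The result depends on the chunk size and differs from the plain hash of the same bytes, so both sides of a comparison must use the same tree layout.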


The Muster library provides implementations of serial and parallel K-Medoids clustering algorithms. It is intended as a general framework for parallel cluster analysis, particularly for performance data analysis on systems with very large numbers of processes.

The parallel implementations in Muster are designed to perform well even in environments where the data to be clustered is entirely distributed. For example, many performance tools need to analyze one data element from each process in a system. To analyze this data efficiently, clustering algorithms that move as little data as possible are required. In Muster, we exploit sampled clustering algorithms to realize this efficiency.

The parallel algorithms in Muster are implemented using the Message Passing Interface (MPI), making them suitable for use on many of the world’s largest supercomputers. They should, however, also run efficiently on your laptop.
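
For readers unfamiliar with K-Medoids, here is a small serial sketch. It uses a simple alternating scheme (assign each point to its nearest medoid, then pick the member of each cluster that minimizes the total distance to the other members). It is neither Muster's C++ API nor its sampled parallel algorithms, and all names are assumptions made for illustration.

```python
def assign(points, medoids, distance):
    """Group points by the index of their nearest medoid."""
    clusters = [[] for _ in medoids]
    for point in points:
        nearest = min(range(len(medoids)),
                      key=lambda i: distance(point, medoids[i]))
        clusters[nearest].append(point)
    return clusters


def k_medoids(points, k, distance, max_iter=100):
    """Return (medoids, clusters); starts from the first k points."""
    medoids = list(points[:k])
    for _ in range(max_iter):
        clusters = assign(points, medoids, distance)
        updated = []
        for i, members in enumerate(clusters):
            if not members:
                # Keep the old medoid if its cluster became empty.
                updated.append(medoids[i])
                continue
            # The new medoid is the member with the smallest total distance.
            updated.append(min(
                members,
                key=lambda c: sum(distance(c, other) for other in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids, distance)


samples = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
medoids, clusters = k_medoids(samples, k=2, distance=lambda a, b: abs(a - b))
print(medoids, clusters)
```

Unlike K-Means, the cluster centers are always actual data elements, so only a distance function is required and no averaging of elements is needed.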


Parallel wavelet compression.

National Data Service

The National Data Service is an emerging vision of how scientists and researchers across all disciplines can find, reuse, and publish data. It is an international federation of data providers, data aggregators, community-specific federations, publishers, and cyberinfrastructure providers. It builds on the data archiving and sharing efforts under way within specific communities and links them together with a common set of tools.


Navigation and estimation tools written in Python.


NcSOS adds an OGC SOS service to datasets in your existing THREDDS server. It complies with the IOOS SWE Milestone 1.0 templates and requires your datasets be in any of the CF 1.6 Discrete Sampling Geometries.

NcSOS acts like other THREDDS services (such as OPeNDAP and WMS) in that there are individual service endpoints for each dataset. It is best to aggregate your files and enable the NcSOS service on top of the aggregation. For example, the NcML aggregate of hourly files from an individual station would be a good candidate to serve with NcSOS. Serving the individual hourly files with NcSOS would not be as beneficial.

You will need a working THREDDS installation of at least version 4.3.16 to run NcSOS.


The numerical differentiation library (NDL) is used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes.
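
A minimal serial sketch of finite differencing for first and second order partial derivatives is given below. It uses plain central differences and is not NDL's interface. The function names and the step sizes `h` are assumptions made for illustration.

```python
def gradient(f, x, h=1e-5):
    """Central-difference estimate of all first order partial derivatives."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def second_partial(f, x, i, j, h=1e-4):
    """Central-difference estimate of d^2 f / (dx_i dx_j)."""
    def shifted(di, dj):
        point = list(x)
        point[i] += di
        point[j] += dj
        return f(point)

    if i == j:
        return (shifted(h, 0.0) - 2.0 * f(x) + shifted(-h, 0.0)) / h ** 2
    return (shifted(h, h) - shifted(h, -h)
            - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h ** 2)


def example(p):
    # f(x, y) = x^2 * y + 3 * y^2
    return p[0] ** 2 * p[1] + 3.0 * p[1] ** 2


point = [1.0, 2.0]
print(gradient(example, point))
print(second_partial(example, point, 0, 1))
```

Every function evaluation in these formulas is independent of the others, which is the kind of work a task-based scheme can distribute across cores or nodes.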


Neko is a high-level dynamically typed programming language. It can be used as an embedded scripting language. It has been designed to provide a common runtime for several different languages. Learning and using Neko is very easy. You can easily extend the language with C libraries. You can also write generators from your own language to Neko and then use the Neko Runtime to compile, run, and access existing libraries.


Neo is a minimal and fast Go web framework with an extremely simple API.

During development you will enjoy automatic recompiling and rerunning of your Neo application whenever the source changes.

Build your Neo application in a few lines of code.


An open-source NoSQL graph database implemented in Java and Scala. With development starting in 2003, it has been publicly available since 2007. The source code and issue tracking are available on GitHub, with support readily available on Stack Overflow and the Neo4j Google group.

Neo4j implements the Property Graph Model efficiently down to the storage level. As opposed to graph processing or in-memory libraries, Neo4j provides full database characteristics including ACID transaction compliance, cluster support, and runtime failover, making it suitable to use graph data in production scenarios.

Neo4j’s free and open-source Community edition is a high-performance, fully ACID-transactional database. The Community edition includes (but is not limited to) all the functionality described previously in this section.
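
To make the Property Graph Model more concrete, the sketch below models nodes and relationships that both carry properties. It is a toy in-memory data structure and says nothing about how Neo4j stores or queries data. All class and method names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int
    end: int
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Nodes and directed, typed relationships, each with key/value properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, labels, **properties):
        node = Node(len(self.nodes), set(labels), properties)
        self.nodes[node.node_id] = node
        return node

    def relate(self, start, end, rel_type, **properties):
        rel = Relationship(start.node_id, end.node_id, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def neighbours(self, node, rel_type):
        """Follow outgoing relationships of one type from `node`."""
        return [self.nodes[r.end] for r in self.relationships
                if r.start == node.node_id and r.rel_type == rel_type]


graph = PropertyGraph()
alice = graph.add_node(["Person"], name="Alice")
bob = graph.add_node(["Person"], name="Bob")
graph.relate(alice, bob, "KNOWS", since=2010)
print([n.properties["name"] for n in graph.neighbours(alice, "KNOWS")])
```

The point of the model is that relationships are first-class: they have a type, a direction and their own properties, rather than being implied by foreign keys.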


GraphGists are an easy way to create and share documents containing not just prose, structure and pictures but most importantly example graph models and use-cases expressed in Neo4j’s query language Cypher. These documents are written in AsciiDoc — the simple, textual markup language — and rendered in your browser as rich and interactive web pages that you can quickly evolve from describing simple howtos or questions to providing an extensive use-case specification.


The NESToolbox is a collection of algorithms to perform similarity estimation for irregularly sampled time series as they arise for example in the geosciences. It is implemented as a toolbox for the widely used software MATLAB and the freely available open-source software OCTAVE.

The installation of the Python portation is simple: just copy the in your working directory.




A simple Fortran 90 interface to NetCDF reading and writing.


NetPIPE is a protocol independent performance tool that visually represents the network performance under a variety of conditions. It performs simple ping-pong tests, bouncing messages of increasing size between two processes, whether across a network or within an SMP system. Message sizes are chosen at regular intervals, and with slight perturbations, to provide a complete test of the communication system. Each data point involves many ping-pong tests to provide an accurate timing. Latencies are calculated by dividing the round trip time in half for small messages ( < 64 Bytes ).

Modules have been added for PVM, TCGMSG, and the 1-sided message-passing standards of MPI-2 and SHMEM. Low level modules have been developed to evaluate GM for Myrinet cards, the GPSHMEM implementation that brings the Cray SHMEM interface to other machines, the low-level ARMCI library, and the LAPI interface for IBM SP systems. Internal testing can be done using a new memcpy module.
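
The ping-pong measurement described above can be sketched with a local socket pair: one thread echoes every message back, and the other times many round trips per message size and halves the average. This is not NetPIPE itself and it only exercises local sockets. The function names, message sizes and repeat count are assumptions, and the sketch reports half the round trip for every size rather than only for small messages.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    """Receive exactly `size` bytes or raise if the peer closes early."""
    buffer = bytearray()
    while len(buffer) < size:
        chunk = sock.recv(size - len(buffer))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buffer.extend(chunk)
    return bytes(buffer)


def echo(sock, sizes, repeats):
    """Bounce every message straight back to the sender."""
    for size in sizes:
        for _ in range(repeats):
            sock.sendall(recv_exact(sock, size))


def ping_pong(sizes, repeats=50):
    """Return {message size: estimated one-way time in seconds}."""
    near, far = socket.socketpair()
    worker = threading.Thread(target=echo, args=(far, sizes, repeats))
    worker.start()
    results = {}
    try:
        for size in sizes:
            payload = bytes(size)
            start = time.perf_counter()
            for _ in range(repeats):
                near.sendall(payload)
                recv_exact(near, size)
            round_trip = (time.perf_counter() - start) / repeats
            results[size] = round_trip / 2.0
    finally:
        near.close()
        worker.join()
        far.close()
    return results


for size, seconds in ping_pong([1, 64, 1024, 65536]).items():
    print(size, "bytes:", seconds * 1e6, "microseconds one way")
```

Averaging over many round trips per size smooths out timer resolution and scheduling noise, which is the reason each data point involves many ping-pong tests.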


NewMadeleine is the fourth incarnation of the Madeleine communication library. The new architecture aims at enabling the use of a much wider range of communication flow optimization techniques. Its design is entirely modular: drivers and optimization strategies are dynamically loadable software components, allowing experimentations with multiple approaches or on multiple issues with regard to processing communication flows.

The optimizing scheduler SchedOpt targets applications with irregular, multi-flow communication schemes such as found in the increasingly common application conglomerates made of multiple programming environments and coupled pieces of code, for instance. SchedOpt itself is easily extensible through the concepts of optimization strategies (what to optimize for, what the optimization goal is) expressed in terms of tactics (how to optimize to reach the optimization goal). Tactics themselves are made of basic communication flows operations such as packet merging or reordering.

The communication library is fully multi-threaded through its close integration with PIOMan. It manages concurrent communication operations from multiple libraries and from multiple threads. Its MPI implementation Mad-MPI fully supports the MPI_THREAD_MULTIPLE multi-threading level. It is available on Infiniband (ibverbs), Myrinet (MX and GM), TCP (sockets) and legacy SCI and Quadrics QsNet-2.


The Neo-Geography Toolkit (NGT) is a collection of tools for automated processing of geospatial data, including images and maps. It is capable of processing raw raster data from remote sensing instruments and transforming it into useful cartographic products, such as visible image base maps, topographic models, etc. Additionally, components of the NGT can perform data processing on extremely large geospatial data sets (up to several tens of terabytes) via parallel processing pipelines. Finally, it can also transform raw metadata (i.e. SPICE kernels and PDS image labels), vector data (e.g., 2D/3D shape files), and geo-tagged data sets into standard NeoGeography data formats, such as KML. NGT is an evolving collection of loosely connected open-source modules designed by the NASA Ames Intelligent Robotics Group. Modules of the NGT will be released one at a time, as they reach maturity. To date, we have completed only one module: the NASA Ames Stereo Pipeline, but more will soon follow.

The NASA Ames Stereo Pipeline (ASP) is a suite of free and open source automated geodesy and stereogrammetry tools designed for processing planetary and Earth imagery captured from satellites and robotic rovers. It produces cartographic products, including digital elevation models (DEMs), ortho-projected imagery, and 3D models. These data products are suitable for science analysis, mission planning, and public outreach.

Please install USGS ISIS version 3.4.7 if you would like to process NASA non-terrestrial imagery. Users wishing to process Digital Globe, GeoEye, or perspective imagery do not need to download anything else.


Nmag is a micromagnetic simulation package.


NodeBox for OpenGL is a free, cross-platform library for generating 2D animations with Python programming code. It is built on Pyglet and adopts the drawing API from NodeBox for Mac OS X. It has built-in support for paths, layers, motion tweening, hardware-accelerated image effects, simple physics and interactivity.

Taking inspiration from Processing, NodeBox lets the user get to work coding graphics using a simplified syntax, without worrying about the underlying technology. Unlike Processing, NodeBox is based on vector graphics rather than pixels. That means that it is an excellent tool for exploring 2D graphics intended for print, and in particular typographic experiments. The exported results take the form of PDF files, ready for use in Adobe Illustrator or any professional vector graphics package. NodeBox can also export Quicktime movies for animations.


Node.js is an open source, cross-platform runtime environment for server-side and networking applications. Node.js applications are written in JavaScript, and can be run within the Node.js runtime. Node.js provides an event-driven architecture and a non-blocking I/O API that optimizes an application’s throughput and scalability. These technologies are commonly used for real-time web applications.

Node.js uses the Google V8 JavaScript engine to execute code, and a large percentage of the basic modules are written in JavaScript. Node.js contains a built-in library to allow applications to act as a Web server without software such as Apache HTTP Server or IIS.


A package management system for client-side programming on the World Wide Web. It depends on Node.js and npm. It works with git and GitHub repositories.


The package manager for node.js. Quite useful, really.


Node-RED is a tool for wiring together hardware devices, APIs and online services in new and interesting ways. Node-RED provides a browser-based flow editor that makes it easy to wire together flows using the wide range of nodes in the palette. Flows can then be deployed to the runtime in a single click.

JavaScript functions can be created within the editor using a rich text editor. A built-in library allows you to save useful functions, templates or flows for re-use.

The light-weight runtime is built on Node.js, taking full advantage of its event-driven, non-blocking model. This makes it ideal to run at the edge of the network on low-cost hardware such as the Raspberry Pi as well as in the cloud.

With over 120,000 modules in Node’s package repository, it is easy to extend the range of palette nodes to add new capabilities.


Running IPython notebook servers on Amazon’s EC2.


Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.

Numba works by generating optimized machine code using the LLVM compiler infrastructure at import time, runtime, or statically (using the included pycc tool). Numba supports compilation of Python to run on either CPU or GPU hardware, and is designed to integrate with the Python scientific software stack.

Now, the idea of Numba is the following. Take a Python function performing numerical operations on NumPy arrays. Normally, this function is interpreted by CPython. It performs Python and NumPy C API calls to execute these operations efficiently.

With Numba, things happen quite differently. At runtime, the function bytecode is analyzed, types are inferred, and LLVM IR is generated before being compiled to machine code. In nopython mode, the LLVM IR doesn’t make Python C API calls. There are many situations where a Python function cannot be compiled in nopython mode because it uses non-trivial Python features or data structures. In this case, the object mode is activated and the LLVM IR makes many Python C API calls.

We regularly use a number of analytic tools for computational neurosciences in our research. The codes for these analytic tools were developed and written by Stefan Fuertinger, PhD, and Joel Zinn, BA. The codes are open source software tools in Python. We are offering them as a free resource for the neuroscience community. All routines require NumPy, SciPy and Matplotlib to be installed; some functions optionally use Weave for embedded C++ code.

Brain network analysis. Routines to construct and analyze brain networks can be found under Network Tools. The Python module comprises all routines presented in this section.


Oasis is a high-level/high-performance open source Navier-Stokes solver written in Python. The solver has been found to show good weak scaling up to 256 CPUs. It is a finite element solver written from scratch in Python using building blocks from FEniCS. The solver is unstructured, runs with MPI and interfaces, through FEniCS, to the state-of-the-art linear algebra backend PETSc. Oasis advocates a high-level, programmable Python user interface, where the user is placed in complete control of every aspect of the solver.

There are currently two solvers implemented, one for steady-state and one for transient flows. The transient solver uses the fractional step algorithm for any finite element discretization of the actual Navier Stokes equations. The steady-state solver is coupled using a mixed space for velocity and pressure.


OCaml is the main implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. OCaml extends the core Caml language with object-oriented constructs.

OCaml’s toolset includes an interactive top level interpreter, a bytecode compiler, and an optimizing native code compiler. It has a large standard library that makes it useful for many of the same applications as Python or Perl, as well as robust modular and object-oriented programming constructs that make it applicable for large-scale software engineering. OCaml is the successor to Caml Light. The acronym CAML originally stood for Categorical Abstract Machine Language, although OCaml abandons this abstract machine.[1]

OCaml is a free open source project managed and principally maintained by INRIA. In recent years, many new languages have drawn elements from OCaml, most notably F# and Scala.


The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results using finite difference, spectral element and discontinuous Galerkin methods show OCCA delivers portable high performance in different architectures and platforms.

High-order finite-difference methods are commonly used in wave propagators for industrial subsurface imaging algorithms. Computational aspects of the reduced linear elastic vertical transversely isotropic propagator are considered. Thread parallel algorithms suitable for implementing this propagator on multi-core and many-core processing devices are introduced. Portability is addressed through the use of the OCCA runtime programming interface. Finally, performance results are shown for various architectures on a representative synthetic test case.


OpenCL implementations are provided as ICDs (Installable Client Drivers). An OpenCL program can use several ICDs thanks to an ICD Loader such as the one provided by this project. This free ICD Loader can load any (free or non-free) ICD.

This package aims at creating an open source alternative to vendor-specific OpenCL ICD loaders. The main difficulty in creating such software is that the order of function pointers in a structure is not publicly available. This software maintains a YAML database of all known and guessed entries. This package also delivers a skeleton of bindings to incorporate inside an OpenCL implementation to give it ICD functionalities.


The Open Community Runtime project is creating an application building framework that explores new methods of high-core-count programming. The initial focus is on HPC applications. Its goal is to create a tool that helps app developers improve the power efficiency, programmability, and reliability of their work while maintaining app performance.

OCR will help the app developer with the complex process of writing multi-core apps by masking the effort to manage event-driven tasks, events (which embody dataflow and code flow dependencies), memory data blocks (with semantic annotations for runtime use), machine description facilities, and more.

This is a large open source project distributed under the GPL-2.0+ open source license. With a mature and established codebase containing almost 8 million lines of code, Linux ACPI is written largely in C. OCR was originally unveiled at Supercomputing Conference 2012 (SC12) with a major new release (v0.8) introduced at Supercomputing 2013 (SC13). Community participation is encouraged, both for runtime enhancement as well as exploration of algorithm/application decomposition for new programming models.


The octavemagic extension provides the ability to interact with Octave. It is provided by the oct2py package, which may be installed using pip or easy_install.


OData (Open Data Protocol) is an OASIS standard that defines the best practice for building and consuming RESTful APIs. OData helps you focus on your business logic while building RESTful APIs without having to worry about the approaches to define request and response headers, status codes, HTTP methods, URL conventions, media types, payload formats and query options etc. OData also guides you about tracking changes, defining functions/actions for reusable procedures and sending asynchronous/batch requests etc. Additionally, OData provides facility for extension to fulfil any custom needs of your RESTful APIs. OData RESTful APIs are easy to consume. The OData metadata, a machine-readable description of the data model of the APIs, enables the creation of powerful generic client proxies and tools. Some of them can help you interact with OData even without knowing anything about the protocol.
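The URL conventions and query options mentioned above can be illustrated with a small sketch. It assumes a hypothetical service root (https://example.org/odata) and a hypothetical Products entity set; only the $filter, $select and $top system query options and the lt comparison operator come from the OData specification itself. The helper build_odata_url is an illustrative name, not part of any OData library.

```python
from urllib.parse import quote

def build_odata_url(service_root, entity_set, filter_expr=None, select=None, top=None):
    """Compose an OData query URL from a few system query options."""
    options = []
    if filter_expr is not None:
        # Percent-encode spaces but keep quotes and parentheses readable.
        options.append("$filter=" + quote(filter_expr, safe="'()"))
    if select:
        options.append("$select=" + ",".join(select))
    if top is not None:
        options.append("$top=" + str(int(top)))
    url = service_root.rstrip("/") + "/" + entity_set
    if options:
        url += "?" + "&".join(options)
    return url

# Hypothetical service and entity set, used only for illustration.
url = build_odata_url(
    "https://example.org/odata",
    "Products",
    filter_expr="Price lt 20",
    select=["Name", "Price"],
    top=5,
)
print(url)
```

A client would then issue an ordinary HTTP GET against the resulting URL; the server decides the payload format according to the negotiated media type.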


ODataPy is an open-source Python library that implements the Open Data Protocol (OData). It supports the OData protocol version 4.0. It is built on top of ODataCpp using language binding. It is under development and currently serves only parts of client and client side proxy generation (code gen) aspects of OData.


OData server with support for MySQL and for BLOBs managed by Leveled.


The OpenFabrics Enterprise Distribution (OFED™) is open-source software for RDMA and kernel bypass applications. OFED is used in business, research and scientific environments that require highly efficient networks, storage connectivity and parallel computing. The software provides high performance computing sites and enterprise data centers with flexibility and investment protection as computing evolves towards applications that require extreme speeds, massive scalability and utility-class reliability.

OFED includes kernel-level drivers, channel-oriented RDMA and send/receive operations, kernel bypasses of the operating system, both kernel and user-level application programming interface (API) and services for parallel message passing (MPI), sockets data exchange (e.g., RDS, SDP), NAS and SAN storage (e.g. iSER, NFS-RDMA, SRP) and file system/database systems.

The network and fabric technologies that provide RDMA performance with OFED include: legacy 10 Gigabit Ethernet, iWARP for Ethernet, RDMA over Converged Ethernet (RoCE), and 10/20/40 Gigabit InfiniBand.


OFF, an open source (free software) code for performing fluid dynamics simulations, is presented. The aim of OFF is to solve, numerically, the unsteady (and steady) compressible Navier–Stokes equations of fluid dynamics by means of finite volume techniques: the research background is mainly focused on high-order (WENO) schemes for multi-fluids, multi-phase flows over complex geometries. To this purpose a highly modular, object-oriented application program interface (API) has been developed. In particular, the concepts of data encapsulation and inheritance available within Fortran language (from standard 2003) have been stressed in order to represent each fluid dynamics “entity” (e.g. the conservative variables of a finite volume, its geometry, etc…) by a single object so that a large variety of computational libraries can be easily (and efficiently) developed upon these objects.


The OmicABEL (pronounced as "amicable") package allows rapid mixed-model based genome-wide association analysis; it efficiently handles large datasets, and both single trait and multiple trait ("omics") analysis.

CLAK-GWAS is software for performing Genome-Wide Association Studies (GWAS). It provides a high-performance implementation of two algorithms, CLAK-Chol and CLAK-Eig, for GWAS involving single and multiple phenotypes, respectively.


Omni Compiler is a collection of programs and libraries that allow users to build code transformation compilers. It translates C and Fortran programs with XcalableMP and/or OpenACC directives into parallel code suitable for compiling with a native compiler linked with the Omni Compiler runtime library.

Omni Compiler consists of the following components:

  • XcalableMP - XcalableMP is a directive-based language extension of C and Fortran for distributed memory systems. XcalableMP allows users to develop a parallel application and to tune its performance with minimal and simple notation.

  • OpenACC - OpenACC is a directive-based programming interface for accelerators such as GPGPU. OpenACC allows users to express the offloading of data and computations to accelerators to simplify the porting process for legacy CPU-based applications.

  • XcodeML - XcodeML is an intermediate code written in XML for C and Fortran languages.


OMP2HMPP is a tool that automatically translates high-level C source code with OpenMP directives into HMPP. The generated version rarely differs from a hand-coded HMPP version, and provides a significant speedup, near 113%, that could later be improved by hand-coded CUDA.


OMP2MPI automatically generates MPI source code from OpenMP, allowing the program to exploit non-shared-memory architectures such as clusters or Network-on-Chip-based (NoC-based) Multiprocessor Systems-on-Chip (MPSoC). OMP2MPI produces a solution that allows further optimization by an expert who wants to achieve better results. On the tested set of problems, it obtains in most cases more than a 20x speedup on 64 cores compared to the sequential version, and an average speedup of over 4x compared to OpenMP.


An open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs. OP2 is an API with associated libraries and preprocessors to generate parallel executables for applications on unstructured grids.

The OP2 project is developing an open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs. Although OP2 is designed to look like a conventional library, the implementation uses source-source translation to generate the appropriate back-end code for the different target platforms.


The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator. OpenACC is designed for portability across operating systems, host CPUs, and a wide range of accelerators, including APUs, GPUs, and many-core coprocessors. The directives and programming model defined in the OpenACC API document allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.

OpenACC in GCC

This page contains information on GCC’s implementation of the OpenACC specification and related functionality.

KernelGen (LLVM)

A prototype of auto-parallelizing Fortran/C compiler for NVIDIA GPUs, targeting numerical modelling code.


OpenAlea is an open source project primarily aimed at the plant research community. It is a distributed collaborative effort to develop Python libraries and tools that address the needs of current and future works in Plant Architecture modeling. OpenAlea includes modules to analyse, visualize and model the functioning and growth of plant architecture.


OpenARC is a new, open source compiler framework that provides an extensible environment in which various performance optimizations, traceability mechanisms, fault tolerance techniques, etc., can be built for better debuggability, performance, and resilience in complex accelerator computing.


Open CASCADE Technology is a software development kit (SDK) intended for development of applications dealing with 3D CAD data, freely available in open source. It includes a set of C++ class libraries providing services for 3D surface and solid modeling, visualization, data exchange and rapid application development.


OpenCL™ is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. OpenCL (Open Computing Language) greatly improves speed and responsiveness for a wide spectrum of applications in numerous market categories from gaming and entertainment to scientific and medical software.

Intel OpenCL

Intel® Code Builder for OpenCL™ API is a comprehensive environment for OpenCL software development on Intel Architecture processors and Intel Xeon Phi™ coprocessors. The Code Builder comprises the Intel implementation of the OpenCL standard and a set of tools for OpenCL application development on Linux* operating systems.


OpenClimateGIS (OCGIS) is a Python package designed for geospatial manipulation, subsetting, computation, and translation of climate datasets stored in local NetCDF files or files served through THREDDS data servers. OpenClimateGIS has a straightforward, request-based API that is simple to use yet complex enough to perform a variety of computational tasks. The software is built entirely from open source packages. ClimateTranslator is a new web interface to the OpenClimateGIS functionality.


OpenCMISS libraries and applications provide the foundation for developing computational modelling and visualisation software, particularly targeting bioengineering.


OpenCores is an open source hardware community developing digital open source hardware through electronic design automation, with a similar ethos to the free software movement. OpenCores hopes to eliminate redundant design work and slash development costs.


OpenFFT is an open source parallel package for computing three-dimensional Fast Fourier Transforms (3-D FFTs) of both real and complex numbers of arbitrary input size. It originates from OpenMX (Open source package for Material eXplorer). OpenFFT adopts a communication-optimal domain decomposition method that is adaptive and capable of localizing data when transposing from one dimension to another for reducing the total volume of communication. It is written in C and MPI, with support for Fortran through the Fortran interface, and employs FFTW3 for computing 1-D FFTs.


OpenFL is a free and open source software framework and platform for the creation of multi-platform applications and video games. OpenFL programs are written in a single language (Haxe) and may be published to Flash movies, or standalone applications for Microsoft Windows, Mac OS X, Linux, iOS, Android, BlackBerry OS, Firefox OS, HTML5 and Tizen.

OpenFL is designed to mimic Adobe Flash Player, and provides much of the same functionality and API. SWF files created with Adobe Flash Professional or other authoring tools may be used in OpenFL programs.


OpenPKG provides a flexible and extensive toolkit of about 1500 portable and high-quality Unix server software packages within a fully self-contained packaging framework. OpenPKG 4 supports all major Unix server platforms, including BSD, GNU/Linux, Solaris and MacOS X flavors, and can be deployed multiple times on a single system without virtualization technologies and with minimum intrusion. The OpenPKG software distribution is updated daily and hence always provides you with the latest Open Source server software.


OpenSim is a freely available, user extensible software system that lets users develop models of musculoskeletal structures and create dynamic simulations of movement.

OpenSim 3.2 includes an improved scripting interface, accessible through the Graphical User Interface (GUI), Matlab, and now Python. We also added new visualization capabilities and usability improvements in the OpenSim application.


OpenUH is an open source, optimizing compiler suite for C, C++ and Fortran, based on Open64. It supports a variety of architectures including x86-64, IA-32, IA-64, MIPS, and PTX.

OpenUH extends the Open64 OpenMP implementation by adding support for nested parallelism and the tasking features introduced in OpenMP 3.0. The OpenMP runtime library that comes with OpenUH supports several task scheduling strategies, enables selection of more scalable barrier algorithms, and provides an implementation of the OpenMP Collector API for interaction with performance collection tools (including DARWIN). The OpenMP implementation has been successfully tested using a number of applications and validated with the NAS Parallel Benchmarks (NPB) and our OpenMP Validation Suite, developed in collaboration with the High Performance Computing Center Stuttgart (HLRS) from the University of Stuttgart. OpenUH also provides support for Fortran coarrays, an extension that has been adopted in the Fortran 2008 standard. With the use of coarrays, a programmer can easily write parallel Fortran programs for a variety of parallel systems. The OpenUH CAF implementation can work in conjunction with either the GASNet or ARMCI runtime libraries, open-source projects which are freely downloadable online.

To achieve portability, OpenUH is able to emit optimized C or Fortran 77 code that may be compiled by a native compiler on other platforms. The supporting runtime libraries are also portable - the OpenMP runtime library is based on the portable Pthreads interface while the Coarray Fortran runtime library is based on the portable GASNet (or, optionally, ARMCI) communications interfaces.


OpenVZ is container-based virtualization for Linux. OpenVZ creates multiple secure, isolated Linux containers (otherwise known as VEs or VPSs) on a single physical server enabling better server utilization and ensuring that applications do not conflict. Each container performs and executes exactly like a stand-alone server; a container can be rebooted independently and have root access, users, IP addresses, memory, processes, files, applications, system libraries and configuration files.

OpenVZ software consists of an optional custom Linux kernel and command-line tools (mainly vzctl). Our kernel developers work hard to merge containers functionality into the upstream Linux kernel, making the OpenVZ team the biggest contributor to the Linux Containers (LXC) kernel, with features such as PID and network namespaces, memory controller, checkpoint-restore, and much more.


Orange File System is a branch of the Parallel Virtual File System. Like PVFS, Orange is a parallel file system designed for use on high end computing (HEC) systems that provides very high performance access to disk storage for parallel applications. OrangeFS is different from PVFS in that we have developed features for OrangeFS that are not presently available in the PVFS main distribution. While PVFS development tends to focus on specific very large systems, Orange considers a number of areas that have not been well supported by PVFS in the past.

OrangeFS is presently integrated with ROMIO through MPICH2, and includes FUSE support. It has also been integrated with pNFS.


ORCM was originally developed as an open-source project (under the Open MPI license) by Cisco Systems, Inc to provide a resilient, 100% uptime run-time environment for enterprise-class routers. Based on the Open Run-Time Environment (ORTE) embedded in Open MPI, the system provided launch and execution support for processes executing within the router itself (e.g., computing routing tables), ensuring that a minimum number of copies of each program were always present. Failed processes were relocated based on the concept of fault groups - i.e., the grouping of nodes with common failure modes. Thus, ORCM attempted to avoid cascade failures by ensuring that processes were not relocated onto nodes with a high probability of failing in the immediate future.

The Cisco implementation naturally required a significant amount of monitoring, and included the notion of fault prediction as a means of taking pre-emptive action to relocate processes prior to their node failing. This was facilitated using an analytics framework that allowed users to chain various analysis modules in the data pipeline so as to perform in-flight data reduction.

Subsequently, ORCM was extended by Greenplum to serve as a scalable monitoring system for Hadoop clusters. While ORCM itself had run on quite a few "nodes" in the Cisco router, and its base ORTE platform has been used for years on very large clusters involving many thousands of nodes, this was the first time the ORCM/ORTE platform had been used solely as a system state-of-health monitor with no responsibility for process launch or monitoring. Instead, ORCM was asked to provide a resilient, scalable monitoring capability that tracked process resource utilization and node state-of-health, collecting all the data in a database for subsequent analysis. Sampling rates were low enough that in-flight data reduction was not required, nor was fault prediction considered to be of value in the Hadoop paradigm.


Ori is a distributed file system built for offline operation that empowers the user with control over synchronization operations and conflict resolution. We provide history through lightweight snapshots and allow users to verify that the history has not been tampered with. Through the use of replication, instances can be resilient and recover damaged data from other nodes.


An open-source extensible framework for the definition of domain-specific languages and generation of optimized (C, Fortran, CUDA, OpenCL) code for multiple architecture targets (e.g., CPUs, NVIDIA and AMD GPUs, Intel Phi), including support for empirical autotuning of the generated code.

Orio is a Python framework for transformation and automatically tuning the performance of codes written in different source and target languages, including transformations from a number of simple languages (e.g., a restricted subset of C) to C, Fortran, CUDA, and OpenCL targets. The tool generates many tuned versions of the same operation using different optimization parameters, and performs an empirical search for selecting the best among multiple optimized code variants.
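The empirical search described above can be sketched in a few lines of plain Python. This is not Orio's API or its search strategy: the operation (summing a list), the tuning parameter (a block size) and the helper names make_blocked_sum and autotune are all assumptions made for illustration. The sketch only shows the general idea of generating several variants of one operation, checking that each stays correct, timing them, and keeping the fastest.

```python
import timeit

def make_blocked_sum(block):
    """Return a variant of 'sum a list' that processes `block` items per step."""
    def blocked_sum(data):
        total = 0
        for start in range(0, len(data), block):
            total += sum(data[start:start + block])
        return total
    return blocked_sum

def autotune(candidates, data, repeats=3):
    """Time every variant on `data`; return (best_block, timings)."""
    timings = {}
    for block in candidates:
        variant = make_blocked_sum(block)
        # Every generated variant must compute the same result.
        assert variant(data) == sum(data)
        timings[block] = min(
            timeit.repeat(lambda: variant(data), number=5, repeat=repeats)
        )
    best = min(timings, key=timings.get)
    return best, timings

data = list(range(20000))
best_block, timings = autotune([8, 64, 512, 4096], data)
print("fastest block size on this machine:", best_block)
```

The winning parameter depends on the machine and the input size, which is the reason such tools search empirically rather than relying on a fixed heuristic.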


ownCloud provides access to your data through a web interface or WebDAV while providing a platform to view, sync and share across devices easily—all under your control. ownCloud’s open architecture is extensible via a simple but powerful API for applications and plugins and works with any storage.


OWSLib is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models.


A Python package for arithmetical computations on random variables. The package is capable of performing the four arithmetic operations: addition, subtraction, multiplication and division, as well as computing many standard functions of random variables. Summary statistics, random number generation, plots, and histograms of the resulting distributions can easily be obtained and distribution parameter fitting is also available. The operations are performed numerically and their results interpolated allowing for arbitrary arithmetic operations on random variables following practically any probability distribution encountered in practice. The package is easy to use, as operations on random variables are performed just as they are on standard Python variables. Independence of random variables is, by default, assumed on each step but some computations on dependent random variables are also possible. We demonstrate on several examples that the results are very accurate, often close to machine precision. Practical applications include statistics, physical measurements or estimation of error distributions in scientific computations.


PaPy, which stands for parallel pipelines in Python, is a highly flexible framework that enables the construction of robust, scalable workflows for either generating or processing voluminous datasets. A workflow is created from user-written Python functions (nodes) connected by pipes (edges) into a directed acyclic graph. These functions are arbitrarily definable, and can make use of any Python modules or external binaries. Given a user-defined topology and collection of input data, functions are composed into nested higher-order maps, which are transparently and robustly evaluated in parallel on a single computer or on remote hosts. Local and remote computational resources can be flexibly pooled and assigned to functional nodes, thereby allowing facile load-balancing and pipeline optimization to maximize computational throughput. Input items are processed by nodes in parallel, and traverse the graph in batches of adjustable size — a trade-off between lazy-evaluation, parallelism, and memory consumption. The processing of a single item can be parallelized in a scatter/gather scheme. The simplicity and flexibility of distributed workflows using PaPy bridges the gap between desktop → grid, enabling this new computing paradigm to be leveraged in the processing of large scientific datasets.
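The idea of user-written functions connected into a directed acyclic graph and evaluated in parallel over input items can be sketched with the Python standard library alone. The code below does not use PaPy and does not mirror its API; the node functions (clean, length, upper, report), the workflow dictionary layout and the helpers run_item and run_workflow are hypothetical names chosen for this illustration, and batching and remote hosts are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical node functions; each receives the outputs of its upstream nodes.
def clean(text):
    return text.strip()

def length(text):
    return len(text)

def upper(text):
    return text.upper()

def report(count, text):
    return f"{text}:{count}"

# node name -> (function, names of upstream nodes); "input" is the raw item.
workflow = {
    "clean": (clean, ["input"]),
    "length": (length, ["clean"]),
    "upper": (upper, ["clean"]),
    "report": (report, ["length", "upper"]),
}

def run_item(item):
    """Push one input item through the graph in dependency order."""
    results = {"input": item}
    graph = {name: deps for name, (_, deps) in workflow.items()}
    for name in TopologicalSorter(graph).static_order():
        if name == "input":
            continue
        func, deps = workflow[name]
        results[name] = func(*(results[dep] for dep in deps))
    return results["report"]

def run_workflow(items, workers=4):
    """Evaluate independent input items in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_item, items))

outputs = run_workflow(["  ab ", "cde", " f"])
print(outputs)  # ['AB:2', 'CDE:3', 'F:1']
```

Threads are used here only to keep the sketch self-contained; CPU-bound nodes would need processes or remote workers to gain real parallel speedup.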


A Python CDM for met/ocean data.

Parallel For

A data parallel scientific programming model.

Compiles efficiently to different parallel architectures like distributed memory with message passing [MPI, PVM] and one-sided communication [MPI-2, shmem], shared memory multi-processor and multi-core processors [POSIX threads, OpenMP, boost threads, Intel TBB], procedure off-loading [Nvidia Cuda, Cell BE, AMD Brook, OpenCL], SIMD vectorization [SSE and AltiVec], and sequential C++ code.

Transforms inherently parallel expressions into efficient parallel C++ code.


PARALUTION is a library that enables you to run various sparse iterative solvers and preconditioners on multi/many-core CPU and GPU devices. Based on C++, it provides a generic and flexible design that allows seamless integration with other scientific software packages.

PARALUTION contains Krylov subspace solvers (CR, CG, BiCGStab, GMRES, IDR), Multigrid (GMG, AMG), Deflated PCG, Fixed-point iteration schemes, Mixed-precision schemes and fine-grained parallel preconditioners based on splitting, ILU factorization with levels, multi-elimination ILU factorization, additive Schwarz and approximate inverse. The library also provides iterative eigenvalue solvers.

The library can be compiled under Linux/Unix-like systems, Windows and Mac OS. PARALUTION provides multi-core CPU/Host (OpenMP), NVIDIA GPU (CUDA, OpenCL), AMD GPU (OpenCL), Intel Xeon Phi/MIC (OpenCL, OpenMP/offload mode) support, including the VS (Visual Studio), gcc (GNU C) and icc (Intel C) compilers.


ParaViewWeb is a collection of components that enables the use of ParaView’s visualization and data analysis capabilities within Web applications. Using the latest HTML 5.0 based technologies, such as WebSocket and WebGL, ParaViewWeb enables communication with a ParaView server running on a remote visualization node or cluster using a light-weight JavaScript API. Using this API, Web applications can easily embed interactive 3D visualization components. Application developers can write simple Python scripts to extend the server capabilities including creating custom visualization pipelines.


The "Programming with Big Data in R" project (pbdR) enables high-level distributed data parallelism in R, so that it can easily utilize large HPC platforms with thousands of cores, making the R language scale to unparalleled heights. We interpret big data quite literally to mean that its size requires parallel processing either because it does not fit in the memory of a single multicore machine or because we need to make its processing time tolerable.

We achieve this, in part, by providing a simple interface to scalable, high performance libraries, such as MPI, ScaLAPACK, and NetCDF4. The routines in these libraries are engaged through R’s classes and methods, so that the R language syntax is largely preserved, but with new, scalable, compiled code underneath. Most of the cumbersome distributed details are abstracted away for the user, although they are readily accessible should the user desire them.


PCA that iteratively replaces missing data. An implementation of probabilistic principal components analysis, a variant of vanilla PCA that can be used to compute factors where some of the data are missing, and to interpolate data by using information from additional series.


PeerVPN is software that builds virtual ethernet networks between multiple computers. Such a virtual network can be useful to facilitate direct communication that applications like file sharing or gaming may need. Often, such direct communication is made impossible or very difficult by firewalls or NAT devices.

Most traditional VPN solutions follow the client-server principle, which means that all participating nodes connect to a central server. This creates a star topology, which has some disadvantages. The central node needs lots of bandwidth, because it needs to handle all the VPN traffic. Also, if the central node goes down, the whole VPN is down too.

A virtual network built by PeerVPN uses a full mesh topology. All nodes talk directly to each other; there is no need for a central server. If one node goes down, the rest of the network is unaffected.
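The difference between the two topologies can be checked with a small graph model. This is only an abstract sketch, not PeerVPN code: the node names and the helpers star, full_mesh and still_connected are made up for illustration. It counts the links each topology needs and tests whether the surviving nodes can still reach each other after one node fails.

```python
from itertools import combinations

def star(nodes, hub):
    """Links of a client-server VPN: every node connects only to the hub."""
    return {frozenset((hub, n)) for n in nodes if n != hub}

def full_mesh(nodes):
    """Links of a full mesh: every pair of nodes is directly connected."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}

def still_connected(links, nodes, failed):
    """True if the remaining nodes can all reach each other after `failed` dies."""
    alive = {n for n in nodes if n != failed}
    live_links = [link for link in links if failed not in link]
    start = next(iter(alive))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for link in live_links:
            if current in link:
                (other,) = link - {current}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen == alive

nodes = ["a", "b", "c", "d", "e"]
star_net = star(nodes, hub="a")
mesh_net = full_mesh(nodes)

print(len(star_net), len(mesh_net))                  # 4 10
print(still_connected(star_net, nodes, failed="a"))  # False: the hub is gone
print(still_connected(mesh_net, nodes, failed="a"))  # True
```

The mesh pays for its resilience with n(n-1)/2 links instead of n-1, so the number of direct connections grows quadratically with the number of nodes.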


PEGASUS is a Peta-scale graph mining system, fully written in Java. It runs in a parallel, distributed manner on top of Hadoop. Hadoop is a cloud computing platform, as well as an open source implementation of the MapReduce framework, which was originally designed for web-scale data processing by Google.

Existing work on graph mining has limited scalability: usually, the maximum graph size is on the order of millions. PEGASUS breaks the limit by scaling up the algorithms to billion-scale graphs. The breakthrough was made possible by careful algorithm design and implementation for Hadoop, a massive cloud computing platform.


This software framework implements a NURBS-based Galerkin finite element method (FEM), popularly known as isogeometric analysis (IGA). It is heavily based on PETSc, the Portable, Extensible Toolkit for Scientific Computation. PETSc is a collection of algorithms and data structures for the solution of scientific problems, particularly those modeled by partial differential equations (PDEs). PETSc is written to be applicable to a range of problem sizes, including large-scale simulations where high-performance parallel computing is a must. PetIGA can be thought of as an extension of PETSc, which adds the NURBS discretization capability and the integration of forms. The PetIGA framework is intended for researchers in the numeric solution of PDEs who have applications which require extensive computational resources.


Petuum is a distributed machine learning framework. It aims to provide a generic algorithmic and systems interface to large scale machine learning, and takes care of difficult systems "plumbing work" and algorithmic acceleration, while simplifying the distributed implementation of ML programs - allowing you to focus on model perfection and Big Data Analytics. Petuum runs efficiently at scale on research clusters and cloud compute like Amazon EC2 and Google GCE.


The primary goal of the PHAML (Parallel Hierarchical Adaptive MultiLevel method) project is to develop new methods and software for the efficient solution of 2D elliptic partial differential equations (PDEs) on distributed memory parallel computers and multicore computers using adaptive mesh refinement and multigrid solution techniques.


Pharo is a pure object-oriented programming language and a powerful environment, focused on simplicity and immediate feedback (think IDE and OS rolled into one).


PHCpack is a software package to solve polynomial systems by homotopy continuation methods.

A polynomial system is given as a sequence of polynomials in several variables. Homotopy continuation methods operate in two stages. In the first stage, a family of polynomial systems (the so-called homotopy) is constructed. This homotopy contains a polynomial system with known solutions. In the second stage, numerical continuation methods are applied to track the solution paths defined by the homotopy, starting at the known solutions and leading to the solutions of the given polynomial system.
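The two stages can be sketched for a single polynomial in one variable. This is a toy illustration, not PHCpack's algorithm or interface: the target f(x) = x^2 - 3x + 2, the start system g(x) = x^2 - 1, the function name track_roots, the fixed step count and the complex constant gamma are all assumptions. The homotopy used is h(x, t) = (1 - t) * gamma * g(x) + t * f(x), and each path is followed with plain Newton corrections at every step of t, without the predictor and adaptive step control a production tracker would use.

```python
def track_roots(f, df, g, dg, starts, gamma=0.6 + 0.8j, steps=200, newton_iters=5):
    """Follow each known root of g to a root of f along
    h(x, t) = (1 - t) * gamma * g(x) + t * f(x), for t going from 0 to 1."""
    roots = []
    for x in starts:
        x = complex(x)
        for k in range(1, steps + 1):
            t = k / steps
            # Corrector: Newton's method on h(., t), started from the
            # solution found at the previous value of t.
            for _ in range(newton_iters):
                h = (1 - t) * gamma * g(x) + t * f(x)
                dh = (1 - t) * gamma * dg(x) + t * df(x)
                x -= h / dh
        roots.append(x)
    return roots

# Target system (roots 1 and 2) and start system (known roots 1 and -1).
f = lambda x: x * x - 3 * x + 2
df = lambda x: 2 * x - 3
g = lambda x: x * x - 1
dg = lambda x: 2 * x

roots = track_roots(f, df, g, dg, starts=[1, -1])
print([round(r.real, 6) for r in roots])
```

The complex constant gamma matters in this example: with gamma = 1 the two solution paths would meet at t = 2/3, where the derivative of h vanishes and Newton's method breaks down, whereas a non-real gamma keeps the paths apart for all t in [0, 1].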


This documentation describes a collection of Python modules to compute solutions of polynomial systems using PHCpack.


A toolbox for developing parallel adaptive finite element programs. PHG deals with conforming tetrahedral meshes and uses bisection for adaptive local mesh refinement and MPI for message passing. PHG has an object oriented design which hides parallelization details and provides common operations on meshes and finite element functions in an abstract way, allowing the users to concentrate on their numerical algorithms.

PHG has a set of rich and easy to use interfaces to other packages, including ParMETIS, PETSc, Hypre, SuperLU, MUMPS, Trilinos, PARPACK, JDBSYM, LOBPCG.


Picat is a simple, and yet powerful, logic-based multi-paradigm programming language aimed for general-purpose applications. Picat is a rule-based language, in which predicates, functions, and actors are defined with pattern-matching rules. Picat incorporates many declarative language features for better productivity of software development, including explicit non-determinism, explicit unification, functions, list comprehensions, constraints, and tabling. Picat also provides imperative language constructs, such as assignments and loops, for programming everyday things. The Picat implementation, which is based on a well-designed virtual machine and incorporates a memory manager that garbage-collects and expands the stacks and data areas when needed, is efficient and scalable. Picat can be used for not only symbolic computations, which is a traditional application domain of declarative languages, but also for scripting and modeling tasks.


The Portable Data-Parallel Visualization and Analysis library (also referred to as PISTON) is a cross-platform software library providing frequently used operations for scientific visualization and data analysis. The algorithms for these operations are specified in a data-parallel way. By using nVidia’s freely downloadable Thrust library and our own tools, we can generate executable codes for different acceleration hardware architectures (GPUs and multi-core CPUs) from a single version of source code. The library is designed to be extensible and is intended to be integrated into other visualization applications.

Platform MPI

IBM® Platform MPI Community Edition is a no-charge community edition of IBM Platform MPI supporting the core MPI features. It is available for download, deployment, and redistribution at no charge. This edition is simple, flexible, powerful, and reliable; easy to install, embed, deploy; embodies core capabilities of Platform MPI for Linux® and Windows®; and provides an optional low cost offering that includes higher rank counts, 24/7 IBM customer support, fix packs, and upgrade protection.


A simple, accessible HTML5 media player.


The Process Management Interface (PMI) has been used for quite some time as a means of exchanging wireup information needed for interprocess communication. Two versions (PMI-1 and PMI-2) have been released as part of the MPICH effort. While PMI-2 demonstrates better scaling properties than its PMI-1 predecessor, attaining rapid launch and wireup of the roughly 1M processes executing across 100k nodes expected for exascale operations remains challenging.

PMI Exascale (PMIx) represents an attempt to resolve these questions by providing an extended version of the PMI standard specifically designed to support clusters up to and including exascale sizes. The overall objective of the project is not to branch the existing pseudo-standard definitions - in fact, PMIx fully supports both of the existing PMI-1 and PMI-2 APIs - but rather to (a) augment and extend those APIs to eliminate some current restrictions that impact scalability, and (b) provide a reference implementation of the PMI-server that demonstrates the desired level of scalability.


Portable Computing Language (pocl) aims to become an MIT-licensed open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPU and heterogeneous GPUs/accelerators.

pocl uses Clang as an OpenCL C frontend and LLVM for the kernel compiler implementation, and as a portability layer. Thus, if your desired target has an LLVM backend, it should be able to get OpenCL support easily by using pocl.

The goal is to accomplish improved performance portability using a kernel compiler that can generate multi-work-item work-group functions that exploit various types of parallel hardware resources: VLIW, superscalar, SIMD, SIMT, multicore, multithread …

An additional purpose of the project is to serve as a research platform for issues in parallel programming on heterogeneous platforms.


The Poincaré code is a Maple project package that aims to gather significant computer algebra normal form (and subsequent reduction) methods for handling nonlinear ordinary differential equations. As a first version, a set of fourteen easy-to-use Maple commands is introduced for symbolic creation of (improved variants of Poincaré’s) normal forms as well as their associated normalizing transformations. The software is the implementation by the authors of carefully studied and followed up selected normal form procedures from the literature, including some authors’ contributions to the subject. As can be seen, joint-normal-form programs involving Lie-point symmetries are of special interest and are published in CPC Program Library for the first time, Hamiltonian variants being also very useful as they lead to encouraging results when applied, for example, to models from computational physics like Hénon–Heiles.


Polly is a high-level loop and data-locality optimizer and optimization infrastructure for LLVM. It uses an abstract mathematical representation based on integer polyhedra to analyze and optimize the memory access pattern of a program. We currently perform classical loop transformations, especially tiling and loop fusion, to improve data locality. Polly can also exploit OpenMP-level parallelism and expose SIMDization opportunities. Work has also been done in the area of automatic GPU code generation.
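
As a rough illustration of the tiling transformation mentioned above, the following Python sketch transposes a matrix with a plain loop nest and with a tiled loop nest. The function names and the tile size are assumptions made for this example; Polly itself works on LLVM IR, and this is only meant to show the shape of the transformation.

```python
def transpose_naive(a):
    """Transpose a square matrix with a plain doubly nested loop."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[j][i] = a[i][j]
    return out


def transpose_tiled(a, tile=2):
    """Same result, but visiting the iteration space tile by tile.

    The two outer loops walk over tile origins; the two inner loops
    stay inside one tile, so reads and writes touch a small region.
    """
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    out[j][i] = a[i][j]
    return out
```

Both functions compute the same result; only the order in which the iterations are visited differs, which is what makes tiling a legal transformation here.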


Pomegranate is an open source Python application that implements the open Webification (w10n) Science API for major scientific data stores (HDF, NetCDF, etc.). It makes a file's inner components, such as attributes and data arrays, directly addressable and accessible via well-defined and meaningful URLs.

Data exposed by w10n-sci API is readily consumable by any HTTP client. It can be as simple as a command line like curl or wget, or as advanced as a full-fledged HTML5 web application such as REX.

Pomegranate has been included in Taiga, a turnkey software tool that simplifies the use of scientific data.

It can be installed as a command line tool and/or a ReSTful web service.

Source code is available at Open Channel Software. However, please note that Pomegranate alone won’t be enough to establish a w10n-sci service. What you really need is the instruction file service-setup.txt, which details the steps necessary to build, install and configure a complete service. Alternatively, use a turnkey solution like Taiga, so that you can be up and running in minutes.


We investigate performance improvements for the discrete element method (DEM) used in ppohDEM. First, we use OpenMP and MPI to parallelize DEM for efficient operation on many types of memory, including shared memory, and at any scale, from small PC clusters to supercomputers. We also describe a new algorithm for the descending storage method (DSM) based on a sort technique that makes creation of contact candidate pair lists more efficient. Finally, we measure the performance of ppohDEM using the proposed improvements, and confirm that computational time is significantly reduced. We also show that the parallel performance of ppohDEM can be improved by reducing the number of OpenMP threads per MPI process.


Describe your software project just once, using Premake’s simple and easy to read syntax, and build it everywhere.

Generate project files for Visual Studio, GNU Make, Xcode, Code::Blocks, and more across Windows, Mac OS X, and Linux. Use the full featured Lua scripting engine to make build configuration tasks a breeze.


PRIMME is a C library to find a number of eigenvalues and their corresponding eigenvectors of a real symmetric or complex Hermitian matrix A. Symmetric and Hermitian eigenvalue problems enjoy a remarkable theoretical structure that allows for efficient and stable algorithms for obtaining a few required eigenpairs. This is probably one of the reasons that enabled applications requiring the solution of symmetric eigenproblems to push their accuracy and thus computational demands to unprecedented levels. Materials science, structural engineering, and some QCD applications routinely compute eigenvalues of matrices of dimension more than a million, and often much more than that. Typically, with increasing dimension comes increased ill conditioning, and thus the use of preconditioning becomes essential.


A programming language, development environment, and online community. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. Initially created to serve as a software sketchbook and to teach computer programming fundamentals within a visual context, Processing evolved into a development tool for professionals.

Processing continues to be an alternative to proprietary software tools with restrictive and expensive licenses, making it accessible to schools and individual students. Its open source status encourages the community participation and collaboration that is vital to Processing’s growth. Contributors share programs, contribute code, and build libraries, tools, and modes to extend the possibilities of the software. The Processing community has written more than a hundred libraries to facilitate computer vision, data visualization, music composition, networking, 3D file exporting, and programming electronics.


A collection of classes that perform the heavy lifting for you, so that you only need to write a minimal amount of code. This library is compatible with both Processing and Processing.js.


Processing.js is the sister project of the popular Processing visual programming language, designed for the web. Processing.js makes your data visualizations, digital art, interactive animations, educational graphs, video games, etc. work using web standards and without any plug-ins. You write code using the Processing language, include it in your web page, and Processing.js does the rest.


This project provides a Python package that creates an environment for graphics applications that closely resembles that of the Processing system. The project mission is to implement Processing’s friendly graphics functions and interaction model in Python. Not all of Processing is to be ported, though, since Python itself already provides alternatives for many features of Processing, such as XML parsing. The pyprocessing backend is built upon OpenGL and Pyglet, which provide the actual graphics rendering. Since these are multiplatform, so is pyprocessing.


We describe an implementation to solve Poisson’s equation for an isolated system on a unigrid mesh using FFTs. The method solves the equation globally on mesh blocks distributed across multiple processes on a distributed-memory parallel computer. Test results to demonstrate the convergence and scaling properties of the implementation are presented. The solver is offered to interested users as the library PSPFFT.


The Parallel Ultra-Light Systolic Array Runtime (PULSAR), now in version 2.0, is a complete programming platform for large-scale distributed memory systems with multicore processors and hardware accelerators. PULSAR provides a simple abstraction layer over multithreading, message-passing, and multi-GPU, multi-stream programming. PULSAR offers a general-purpose programming model, suitable for a wide range of scientific and engineering applications.

This simple programming model allows the user to define the computation in the form of a Virtual Systolic Array (VSA), which is a set of Virtual Data Processors (VDPs) connected by data channels. This programming model is also accessible to the user through a very small and simple Application Programming Interface (API), and all the complexity of executing the workload on a large-scale system is hidden in the runtime implementation.

The runtime supports distributed memory systems with multicore processors and relies on POSIX Threads (a.k.a. Pthreads) for intra-node multithreading, and on the Message Passing Interface (MPI) for inter-node communication. The runtime also supports multiple Nvidia GPU accelerators, in each distributed memory node, using the Compute Unified Device Architecture (CUDA) platform.

Pure Data

Pure Data (aka Pd) is an open source visual programming language. Pd enables musicians, visual artists, performers, researchers, and developers to create software graphically, without writing lines of code. Pd is used to process and generate sound, video, 2D/3D graphics, and interface sensors, input devices, and MIDI. Pd can easily work over local and remote networks to integrate wearable technology, motor systems, lighting rigs, and other equipment. Pd is suitable for learning basic multimedia processing and visual programming methods as well as for realizing complex systems for large-scale projects.

Pd is a so-called data flow programming language, where software called patches is developed graphically. Algorithmic functions are represented by objects, placed on a screen called the canvas. Objects are connected together with cords, and data flows from one object to another through these cords. Each object performs a specific task, from very low level mathematical operations to complex audio or video functions such as reverberation, FFT transforms, or video decoding.


Linux-centric monolithic distribution based on pd-extended with focus on solid/stable core, enhancements, and usability features including infinite undo, gui-based iemgui object editing, accelerated visual editor and gui operations, improved appearance, K12 education mode, and more. The distribution is developed for and maintained by Virginia Tech’s Linux Laptop Orchestra (L2Ork).


Pushpin is a new way to build realtime HTTP and WebSocket services.


A collection of standard atmospheric and oceanic sciences routines.


The Python ARM Radar Toolkit, Py-ART, is an open source Python module containing a growing collection of weather radar algorithms and utilities built on top of the Scientific Python stack and distributed under the 3-Clause BSD license. Py-ART is used by the Atmospheric Radiation Measurement (ARM) Climate Research Facility for working with data from a number of precipitation and cloud radars, but has been designed so that it can be used by others in the radar and atmospheric communities to examine, process, and analyse data from many types of weather radars.


A library which follows the Python/C API as closely as possible, while providing equivalent functionality for Objective Caml. This is built against Python 2.x and OCaml 3.04.

It is intended to allow users to build native OCaml libraries and use them from Python and, conversely, to allow OCaml users to benefit from linkable libraries provided for Python.


pyDatalog adds the logic programming paradigm to Python’s extensive toolbox, in a pythonic way.

Logic programmers can now use the extensive standard library of Python, and Python programmers can now express complex algorithms quickly.

Datalog is a truly declarative language derived from Prolog, with strong academic foundations. Datalog excels at managing complexity. Datalog programs are shorter than their Python equivalents, and Datalog statements can be specified in any order, as simply as formulas in a spreadsheet.
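
To give a feel for what a Datalog program computes, here is a minimal sketch in plain Python of naive bottom-up evaluation for the classic ancestor rules. It deliberately does not use pyDatalog's own syntax or API; the function name and the tuple representation of facts are assumptions made for this illustration.

```python
def ancestors(parent_facts):
    """Evaluate two Datalog-style rules to a fixed point:

        ancestor(X, Y) <= parent(X, Y)
        ancestor(X, Y) <= parent(X, Z) & ancestor(Z, Y)

    parent_facts is an iterable of (parent, child) pairs.
    """
    facts = set(parent_facts)
    derived = set(facts)              # first rule: every parent is an ancestor
    while True:
        # second rule: join parent(X, Z) with ancestor(Z, Y)
        new = {(x, y)
               for (x, z) in facts
               for (z2, y) in derived
               if z == z2}
        if new <= derived:            # nothing new: fixed point reached
            return derived
        derived |= new
```

The order in which the two rules are applied does not change the result, which mirrors the point above that Datalog statements can be given in any order.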


PyDom is a Python package which implements various diagnostics for NEMO model output.


PyFR is an open-source Python based framework for solving advection-diffusion type problems on streaming architectures using the Flux Reconstruction approach of Huynh. The framework is designed to solve a range of governing systems on mixed unstructured grids containing various element types. It is also designed to target a range of hardware platforms via use of an in-built domain specific language derived from the Mako templating engine.


A cross-platform windowing and multimedia library for Python. Pyglet provides an object-oriented programming interface for developing games and other visually-rich applications for Windows, Mac OS X and Linux. Features include:

  • No external dependencies or installation requirements. For most application and game requirements, pyglet needs nothing else besides Python, simplifying distribution and installation.

  • Take advantage of multiple windows and multi-monitor desktops. pyglet allows you to use as many windows as you need, and is fully aware of multi-monitor setups for use with fullscreen games.

  • Load images, sound, music and video in almost any format. pyglet can optionally use AVbin to play back audio formats such as MP3, OGG/Vorbis and WMA, and video formats such as DivX, MPEG-2, H.264, WMV and Xvid.


Utilities for applying scikit-learn to spatial datasets.


pyKML is a Python package for creating, parsing, manipulating, and validating KML, a language for encoding and annotating geographic data.

pyKML is based on the lxml.objectify API which provides a Pythonic API for working with XML documents. pyKML adds additional functionality specific to the KML language.

KML comes in several flavors. pyKML can be used with KML documents that follow the base OGC KML specification, the Google Extensions Namespace, or a user-supplied extension to the base KML specification (defined by an XML Schema document).
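
For readers unfamiliar with KML, the sketch below builds a one-placemark document in the base OGC KML 2.2 namespace. It uses only Python's standard xml.etree.ElementTree module rather than pyKML or lxml, so it says nothing about pyKML's actual API; the helper name make_placemark_kml is hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # base OGC KML 2.2 namespace


def make_placemark_kml(name, lon, lat):
    """Return a KML string containing a single named point placemark."""
    ET.register_namespace("", KML_NS)      # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinates are written as longitude,latitude
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")
```

A call such as make_placemark_kml("Test", -122.0, 37.0) yields a string that can be parsed back with ET.fromstring for inspection.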


A machine learning research library based on Theano.


Pynamic is a benchmark designed to test a system’s ability to handle the Dynamic Linking and Loading requirements of Python-based scientific applications. We developed this benchmark to represent a newly emerging class of DLL behaviors. Pynamic builds on pyMPI, an MPI extension to Python. Our augmentation includes a code generator that automatically generates Python C-extension dummy codes and a glue layer that facilitates linking and loading of the generated dynamic modules into the resulting pyMPI. Pynamic is configurable, enabling it to model the static properties of a specific code. It does not, however, model any significant computations of the target and hence it is not subjected to the same level of control as the target code. In fact, we encourage HPC computer vendors and tool developers to add it to their test suites. This benchmark provides an effective test of the compiler, the linker, the loader, the OS kernel and other runtime systems of a high performance computing (HPC) system to handle an important aspect of modern scientific computing applications. In addition, the benchmark serves as a stress test case for code development tools. Although Python has recently gained popularity in the HPC community, its heavy use of DLL operations has hindered certain HPC code development tools, notably parallel debuggers, from performing optimally.

The heart of Pynamic is a Python script that generates C files and compiles them into shared object libraries. Each library contains a Python callable entry function as well as a number of utility functions. The user can also enable cross library function calls with a command line argument. The Pynamic configure script then links these libraries into the pynamic-pyMPI executable and creates a driver script to exercise the functions in the generated libraries. The user can specify the number of libraries to create, as well as the average number of utility functions per library, thus tailoring the benchmark to match some application of interest. Pynamic introduces randomness in the number of functions per module and the function signatures, thus ensuring some heterogeneity of the libraries and functions.


A Python remote procedure call framework that uses JSON RPC v2.0. Python-JRPC allows programmers to create powerful client/server programs with very little code.
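
The following sketch shows what JSON-RPC 2.0 request and response messages look like on the wire, using only Python's standard json module. It is not Python-JRPC's API; the function names and the dictionary of callable methods are assumptions made for illustration, and only positional parameters are handled.

```python
import json


def make_request(method, params, request_id):
    """Serialize a JSON-RPC 2.0 request with positional parameters."""
    return json.dumps({"jsonrpc": "2.0", "method": method,
                       "params": params, "id": request_id})


def handle_request(raw, methods):
    """Dispatch one request against a dict of name -> callable.

    Unknown methods produce the standard 'Method not found' error (-32601).
    """
    req = json.loads(raw)
    func = methods.get(req["method"])
    if func is None:
        return json.dumps({"jsonrpc": "2.0",
                           "error": {"code": -32601,
                                     "message": "Method not found"},
                           "id": req.get("id")})
    result = func(*req.get("params", []))
    return json.dumps({"jsonrpc": "2.0", "result": result,
                       "id": req.get("id")})
```

A real framework would add a transport (for example sockets or HTTP), batching, and the remaining error codes; this only covers the message format.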


PyPy is a fast, compliant alternative implementation of the Python language (2.7.8 and 3.2.5).

PyPy is a replacement for CPython. It is built using the RPython language that was co-developed with it. The main reason to use it instead of CPython is speed: it generally runs faster. PyPy 2.5 implements Python 2.7.8 and runs on Intel x86 (IA-32), x86_64 and ARM platforms, with PPC being stalled. It supports all of the core language, passing the Python test suite (with minor modifications that were already accepted in the main Python in newer versions). It supports most of the commonly used Python standard library modules.


PyRDM is a Python-based library for research data management (RDM). It facilitates the automated publication of scientific software and associated input and output data.


PySPH is an open source framework for Smoothed Particle Hydrodynamics (SPH) simulations. It is implemented in Python and the performance critical parts are implemented in Cython.

PySPH is implemented in a way that allows a user to specify the entire SPH simulation in pure Python. High-performance code is generated from this high-level Python code, compiled on the fly and executed. PySPH also features optional automatic parallelization using mpi4py and Zoltan. If you wish to use the parallel capabilities you will need to have these installed.


Python-nvd3 is a wrapper for the NVD3 graph library. NVD3 is an attempt to build re-usable charts and chart components for d3.js without taking away the power that d3.js offers you. Python-nvd3 makes your life easy! You write Python and the library renders JavaScript for you.

Python Packaging User Guide

The “Python Packaging User Guide” (PPUG) aims to be the authoritative resource on how to package and install Python distributions using current tools, but also on the efforts to improve Python packaging. The guide is part of a larger effort to improve all of the packaging and installation docs, including pip, setuptools, virtualenv, wheel, distlib, and


pyunicorn (UNIfied COmplex Network and RecurreNce analysis toolbox) is a fully object-oriented Python package for the advanced analysis and modeling of complex networks. Beyond the standard measures of complex network theory such as degree, betweenness and clustering coefficient, it provides some uncommon but interesting statistics like Newman’s random walk betweenness. pyunicorn features novel node-weighted (node splitting invariant) network statistics as well as measures designed for analyzing networks of interacting/interdependent networks.

Moreover, pyunicorn makes it easy to construct networks from uni- and multivariate time series data (functional (climate) networks and recurrence networks). This involves linear and nonlinear measures of time series analysis for constructing functional networks from multivariate data as well as modern techniques of nonlinear analysis of single time series like recurrence quantification analysis (RQA) and recurrence network analysis.


QEMU is a generic and open source machine emulator and virtualizer.

When used as a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM board) on a different machine (e.g. your own PC). By using dynamic translation, it achieves very good performance.

When used as a virtualizer, QEMU achieves near native performances by executing the guest code directly on the host CPU. QEMU supports virtualization when executing under the Xen hypervisor or using the KVM kernel module in Linux. When using KVM, QEMU can virtualize x86, server and embedded PowerPC, and S390 guests.


QUARK (QUeuing And Runtime for Kernels) provides a library that enables the dynamic, superscalar execution of tasks with data dependencies in a multi-core, multi-socket, shared-memory environment. QUARK infers data dependencies and precedence constraints between tasks based on the way the data is used, and then executes the tasks in a dynamic, asynchronous, superscalar fashion in order to achieve a high utilization of the available resources.

QUARK is designed to be easy to use, scales to large numbers of cores, and enables the efficient expression and implementation of complex algorithms. The QUARK runtime is codesigned with the PLASMA linear algebra library, and it contains optimizations inspired by the algorithms in PLASMA.
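
The dependency inference described above can be sketched in a few lines of Python. This is not QUARK's interface (QUARK is a C library); the task tuples and the function name are hypothetical, and the sketch conservatively records every conflicting earlier task rather than a minimal set.

```python
def infer_dependencies(tasks):
    """Infer which earlier tasks each task must wait for.

    tasks is a list of (name, reads, writes) in submission order.
    A task depends on an earlier one if they conflict on some datum:
    read-after-write, write-after-read, or write-after-write.
    """
    deps = {}
    for i, (name, reads, writes) in enumerate(tasks):
        reads, writes = set(reads), set(writes)
        waits = set()
        for prev_name, prev_reads, prev_writes in tasks[:i]:
            prev_reads, prev_writes = set(prev_reads), set(prev_writes)
            raw = prev_writes & reads    # we read what it wrote
            war = prev_reads & writes    # we overwrite what it read
            waw = prev_writes & writes   # we overwrite what it wrote
            if raw or war or waw:
                waits.add(prev_name)
        deps[name] = waits
    return deps
```

Tasks with an empty dependency set can start immediately, and any two tasks that do not depend on each other may run concurrently, which is the sense in which execution is "superscalar".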

The R Package


This package carries out empirical mode decomposition and Hilbert spectral analysis.


Principal component analysis (PCA) is widely used to analyze high-dimensional data, but it is sensitive to outliers. Robust PCA methods seek fits that are unaffected by the outliers and can therefore be trusted to reveal them. FastHCS is a robust PCA algorithm that adapts the FastPCS estimator of location and scatter to the high-dimensional setting, including cases where the number of variables exceeds the number of observations. After detailing the FastHCS algorithm, we carry out an extensive simulation study and four real data applications, the results of which show that FastHCS is systematically more robust to outliers than its competitors.


R package implementing multitaper spectral estimation techniques used in time series analysis. This version may be slightly more up to date than the one on CRAN.


This package provides a framework to perform Non-negative Matrix Factorization (NMF). It implements a set of already published algorithms and seeding methods, and provides a framework to test, develop and plug new/custom algorithms. Most of the built-in algorithms have been optimized in C++, and the main interface function provides an easy way of performing parallel computations on multicore machines.


Supports the analysis of Oceanographic data, including ADP measurements, CTD measurements, sectional data, sea-level time series, coastline files, etc. Provides functions for calculating seawater properties such as potential temperature and density, as well as derived properties such as buoyancy frequency and dynamic height.


Functions for transforming and viewing 2-D and 3-D (oceanographic) data and model output.


OpenCPU is a system for embedded scientific computing and reproducible research. The OpenCPU server provides a reliable and interoperable HTTP API for data analysis based on R. You can either use the public servers or host your own. The OpenCPU JavaScript client library provides the most seamless integration of R and JavaScript available today. Enjoy simple RPC and data I/O through standard Ajax techniques. No need to learn crazy widgets or obscure frameworks. The OpenCPU API is a clean and simple interface to R, nothing more, nothing less. It is compatible with any language or framework that speaks HTTP.


A suite of functions for converting sp-class objects into KML or KMZ documents for use in Google Earth, enabling visualization of spatial and spatio-temporal objects in Google Earth.


A multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package includes also some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.


This package builds on the EMD package to provide additional tools for empirical mode decomposition (EMD) and Hilbert spectral analysis. It also implements the ensemble empirical mode decomposition (EEMD) and the complete ensemble empirical mode decomposition (CEEMD) methods to avoid mode mixing and intermittency problems found in EMD analysis. The package comes with several plotting methods that can be used to view intrinsic mode functions, the HHT spectrum, and the Fourier spectrum. To see the version history and download the bleeding-edge version (at your own risk!), see the project website below. See the other links for PDF files describing numerical and exact analytical methods for determining instantaneous frequency, some examples of signals processed with this package, and some examples of the ensemble empirical mode decomposition method.


An R interface for C library libeemd for performing the ensemble empirical mode decomposition (EEMD), its complete variant (CEEMDAN) or the regular empirical mode decomposition (EMD).


Rserve is a TCP/IP server which allows other programs to use facilities of R from various languages without the need to initialize R or link against the R library. Every connection has a separate workspace and working directory. Client-side implementations are available for popular languages such as C/C++, PHP and Java. Rserve supports remote connection, authentication and file transfer. Typical use is to integrate an R backend for computation of statistical models, plots etc. in other applications.


Shiny makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive pre-built widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.


Discrete Prolate Spheroidal Sequence (Slepian) Regression Smoothers.


Package for discrete Morse-Smale complex approximation based on a kNN graph. The Morse-Smale complex provides a decomposition of the domain. This package provides methods to compute a hierarchical sequence of Morse-Smale complexes and tools that exploit this domain decomposition for regression and visualization of scalar functions.


This package provides regularized principal component analysis incorporating smoothness, sparseness and orthogonality of eigenfunctions by using alternating direction method of multipliers (ADMM) algorithm.


This package contains a set of measures of dissimilarity between time series to perform time series clustering. Metrics based on raw data, on generating models and on the forecast behavior are implemented. Some additional utilities related to time series clustering are also provided, such as clustering algorithms and cluster evaluation metrics.
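
As one concrete example of a raw-data dissimilarity between time series, the sketch below computes a dynamic time warping (DTW) distance in plain Python. It is an illustration of the general idea only, not the package's R interface; the function name and the use of absolute difference as the local cost are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j],
    using |a[i-1] - b[j-1]| as the local cost of matching two points.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

A matrix of such pairwise distances is the usual input to a hierarchical or partitioning clustering algorithm.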


The W2CWM2C package is a set of functions to produce new graphical tools for wavelet correlation (bivariate and multivariate cases) using some routines from the waveslim and wavemulcor packages.


Wavelet analysis and reconstruction of time series, cross-wavelets and phase-difference (with filtering options), significance with simulation algorithms.


Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002).


Methods to calculate and interpret climate change signals and time series from climate multi-model ensembles. Climate model output in binary NetCDF format is read in and aggregated over a specified region to a data.frame for statistical analysis. Global circulation models (GCMs), such as the CMIP5 or CMIP3 simulations, can be read in the same way as Regional Climate Models (RCMs), such as the CORDEX or ENSEMBLES simulations.


Simulation and Inference for Stochastic Differential Equations. The YUIMA Project is an open source and collaborative effort aimed at developing the R package yuima for simulation and inference of stochastic differential equations. In the yuima package stochastic differential equations can be of very abstract type, multidimensional, driven by Wiener process or fractional Brownian motion with general Hurst parameter, with or without jumps specified as Lévy noise. The yuima package is intended to offer the basic infrastructure on which complex models and inference procedures can be built. This paper explains the design of the yuima package and provides some examples of applications.


The Rapid Python Deep Learning Infrastructure (RaPyDLI) project is based on the objective to combine high level Python, C/C++ and Java environments with carefully designed libraries supporting GPU accelerators and MIC coprocessors. Interactive analysis and visualization will be supported together with scaling from the current terabyte size to Petabyte datasets to enable substantial progress in the complexity and capability of the DL applications. A broad range of storage models will be supported including network file systems, databases and HDFS. The partnership of Indiana University, University of Tennessee Knoxville, and Stanford University combines leaders in parallel computing algorithms and run times, Big Data, clouds, and deep learning.


Array Databases allow storing and querying massive multi-dimensional arrays, such as sensor, image, simulation, and statistics data appearing in domains like earth, space, and life science.

The rasdaman ("raster data manager") is the leading array analytics engine distinguished by its flexibility, performance, and scalability. Rasdaman embeds itself smoothly into PostgreSQL, but can also run standalone on file systems. In fact, rasdaman has pioneered Array Databases being the first fully implemented, operationally used system with an array query language and optimized processing engine; known rasdaman databases exceed 230 TB.


The petascope component of rasdaman implements the OGC interface standards WCS 2.0, WCS-T 1.4, WCPS 1.0, WPS 1.0, and WMS 1.1. For this purpose, petascope maintains its additional metadata (such as georeferencing) which is kept in separate relational tables. Note that not all rasdaman raster objects and collections are available through petascope by default; rather, they need to be registered through the petascope administration interface.

Petascope is implemented as a war file of servlets which give access to coverages (in the OGC sense) stored in rasdaman. Internally, incoming requests requiring coverage evaluation are translated into rasql queries by petascope. These queries are passed on to rasdaman, which constitutes the central workhorse. Finally, results returned from rasdaman are forwarded to the client.


Clean and fast and geospatial raster I/O for Python programmers who use Numpy. Rasterio employs GDAL under the hood for file I/O and raster formatting. Its functions typically accept and return Numpy ndarrays. Rasterio is designed to make working with geospatial raster data more productive and more fun.

Raster Numpy Basics

IPython notebook tutorial on nbviewer.


Recki-CT is a set of tools that implement a PHP compiler, in PHP. It doesn’t provide a VM, so it can’t run PHP by itself. However, it can parse PHP code and generate other code from it. Recki uses the well-known PHP-Parser library to generate a graph-based representation of the code, and convert it to an intermediate representation. This intermediate form is pretty low-level, and it is comparatively simple to generate code from it for a variety of targets. One of the targets Recki can use is a second component, JitFu, which is a PHP extension allowing us to generate machine code at run time.


JIT-Fu is a PHP extension that exposes an OO API for the creation of native instructions to PHP userland, using libjit.


LibJIT is a library that provides generic Just-In-Time compiler functionality independent of any particular bytecode, language, or runtime. The goal of the libjit project is to provide an extensive set of routines that takes care of the bulk of the JIT process, without tying the programmer down with language specifics. Where we provide support for common object models, we do so strictly in add-on libraries, not as part of the core code.


Redis is an open source, BSD licensed, advanced key-value cache and store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets, sorted sets, bitmaps and hyperloglogs.


A vision of heterogeneous computer systems that incorporate diverse accelerators and automatically select the best computational unit for a particular task is widely shared among researchers and many industry analysts; however, there are no agreed-upon benchmarks to support the research needed in the development of such a platform. There are many suites for parallel computing on general-purpose CPU architectures, but accelerators fall into a gap that is not covered by previous benchmark development. Rodinia is released to address this concern.


The Robot Operating System (ROS) is a flexible framework for writing robot software. It is a collection of tools, libraries, and conventions that aim to simplify the task of creating complex and robust robot behavior across a wide variety of robotic platforms.


This professional scientific software computes recurrence plots, cross recurrence plots, joint recurrence plots and recurrence quantification analysis on the command line of Unix and DOS/DOS-emulated systems. It is able to work with very long data series. However, the output of the results (plots) has to be prepared with external programs (e.g. gnuplot or Matlab).

The state space trajectory can be reconstructed from single time-series by time-delay embedding. Alternatively, the columns of input data can be used as the components of the state space vectors.
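
The embedding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own code: the function names, the Euclidean norm, and the threshold value `eps` are assumptions chosen for the example.

```python
import math


def delay_embed(series, dim, tau):
    """Return time-delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n)]


def recurrence_matrix(vectors, eps):
    """R[i][j] is 1 when states i and j lie within eps of each other."""
    return [[1 if math.dist(a, b) <= eps else 0 for b in vectors] for a in vectors]


def recurrence_rate(matrix):
    """Fraction of recurrent points, a basic quantification measure."""
    n = len(matrix)
    return sum(map(sum, matrix)) / (n * n)


# A sine wave with period 8 samples: states recur every 8 steps.
series = [math.sin(2 * math.pi * t / 8) for t in range(32)]
vectors = delay_embed(series, dim=2, tau=2)
R = recurrence_matrix(vectors, eps=0.1)
```

Plotting `R` as an image (for instance with gnuplot) gives the recurrence plot; for a periodic signal it shows diagonal lines spaced one period apart.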


RPerl is an upgrade to the popular Perl 5 programming language. RPerl gives software developers a compiler to make their apps run really fast on parallel computing platforms like multi-core processors, the cloud, clusters, and supercomputers. RPerl stands for Restricted Perl, in that we restrict our use of Perl to those parts which can be made to run fast.

The input to the RPerl compiler is low-magic Perl 5 source code. RPerl converts the low-magic Perl 5 source code into C source code using Perl and/or C data structures. Inline::CPP converts the C source code into XS source code. Perl’s XS tools and a standard C compiler convert the XS source code into machine-readable binary code, which can be directly linked back into normal high-magic Perl 5 source code.

The output of the RPerl compiler is fast-running binary code that is exactly equivalent to, and compatible with, the original low-magic Perl 5 source code input. The net effect is that RPerl compiles slow low-magic Perl 5 code into fast binary code, which can optionally be mixed back into high-magic Perl apps.


This document describes an implementation in C of a set of randomized algorithms for computing partial Singular Value Decompositions (SVDs). The techniques largely follow the prescriptions in the article "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions," N. Halko, P.G. Martinsson, J. Tropp, SIAM Review, 53(2), 2011, pp. 217-288, but with some modifications to improve performance. The codes implement a number of low rank SVD computing routines for three different sets of hardware: (1) single core CPU, (2) multi core CPU, and (3) massively multicore GPU.




Prawn is a nimble PDF writer for Ruby. More important, it’s a hackable platform that offers both high level APIs for the most common needs and low level APIs for bending the document model to accommodate special circumstances.

With Prawn, you can write text, draw lines and shapes and place images anywhere on the page and add as much color as you like. In addition, it brings a fluent API and aggressive code re-use to the printable document space.


Reactive extensions to Python.

The Reactive Extensions for Python (RxPY) is a set of libraries for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators in Python. Using Rx, developers represent asynchronous data streams with Observables, query asynchronous data streams using LINQ operators, and parameterize the concurrency in the asynchronous data streams using Schedulers. Simply put, Rx = Observables + LINQ
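
To make the "Observables + query operators" idea concrete, here is a toy sketch in plain Python. It deliberately does not use the RxPY API: the `Observable` class below and its methods are hypothetical stand-ins written for this example, and schedulers are left out entirely.

```python
class Observable:
    """Toy push-based sequence; an illustration, not RxPY's Observable."""

    def __init__(self, subscribe):
        # subscribe is a function that takes an on_next callback
        self._subscribe = subscribe

    @classmethod
    def from_iterable(cls, items):
        def subscribe(on_next):
            for item in items:
                on_next(item)
        return cls(subscribe)

    def map(self, fn):
        # Transform every value before passing it downstream.
        return Observable(lambda on_next: self._subscribe(lambda x: on_next(fn(x))))

    def filter(self, pred):
        # Forward only the values for which pred is true.
        def subscribe(on_next):
            self._subscribe(lambda x: on_next(x) if pred(x) else None)
        return Observable(subscribe)

    def subscribe(self, on_next):
        self._subscribe(on_next)


received = []
(Observable.from_iterable(range(6))
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x * 10)
    .subscribe(received.append))
```

Nothing runs until `subscribe` is called; the operators only build up a chain of callbacks, which is the property that lets a real library decide later where and when the work is scheduled.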


SageManifolds is a package under development for the modern computer algebra system Sage, implementing differential geometry and tensor calculus.

SageManifolds deals with real differentiable manifolds of arbitrary dimension. The basic objects are tensor fields and not tensor components in a given vector frame or coordinate chart. In other words, various charts and frames can be introduced on the manifold and a given tensor field can have representations in each of them.

An important class of treated manifolds is that of pseudo-Riemannian manifolds, among which Riemannian manifolds and Lorentzian manifolds, with applications to General Relativity. In particular, SageManifolds implements the computation of the Riemann curvature tensor and associated objects (Ricci tensor, Weyl tensor). SageManifolds can also deal with generic affine connections, not necessarily Levi-Civita ones.

The SageManifolds project aims at extending the mathematics software system Sage towards differential geometry and tensor calculus. Like Sage, SageManifolds is free, open-source and is based on the Python programming language. We discuss here some details of the implementation, which relies on Sage’s parent/element framework, and present a concrete example of use.


Sailfish is an open source (LGPL) fluid dynamics solver based on the lattice Boltzmann method (LBM). It uses run-time code generation techniques to automatically generate optimized, simulation-specific code for GPU devices (both CUDA and OpenCL targets are supported).


The Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory is developing algorithms and software technology to enable the application of structured adaptive mesh refinement (SAMR) to large-scale multi-physics problems relevant to U.S. Department of Energy programs.

SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) is an object-oriented C++ software library that enables exploration of numerical, algorithmic, parallel computing, and software issues associated with applying structured adaptive mesh refinement (SAMR) technology in large-scale parallel application development. SAMRAI provides software tools for developing SAMR applications that involve coupled physics models and sophisticated numerical solution methods, and that require high-performance parallel computing hardware. SAMRAI enables integration of SAMR technology into existing codes and simplifies the exploration of SAMR methods in new application domains. Due to judicious application of object-oriented design, SAMRAI capabilities are readily enhanced and extended to meet specific problem requirements. The SAMRAI team collaborates with application researchers at LLNL and other institutions. These interactions motivate the continued evolution of the SAMRAI library.


An object-functional programming language for general software applications. Scala has full support for functional programming and a very strong static type system. This allows programs written in Scala to be very concise and thus smaller in size than equivalent programs in other general-purpose programming languages. Many of Scala’s design decisions were inspired by criticism of the shortcomings of Java.

Scala source code is intended to be compiled to Java bytecode, so that the resulting executable code runs on a Java virtual machine. Java libraries may be used directly in Scala code and vice versa (Language interoperability).[8] Like Java, Scala is object-oriented, and uses a curly-brace syntax reminiscent of the C programming language. Unlike Java, Scala has many features of functional programming languages like Scheme, Standard ML and Haskell, including currying, type inference, immutability, lazy evaluation, and pattern matching. It also has an advanced type system supporting algebraic data types, covariance and contravariance, higher-order types, and anonymous types. Other features of Scala not present in Java include operator overloading, optional parameters, named parameters, raw strings, and no checked exceptions.


Saddle is a data manipulation library for Scala that provides array-backed, indexed, one- and two-dimensional data structures that are judiciously specialized on JVM primitives to avoid the overhead of boxing and unboxing.

Saddle offers vectorized numerical calculations, automatic alignment of data along indices, robustness to missing (N/A) values, and facilities for I/O.
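
The alignment and missing-value behaviour can be illustrated with a small sketch. It is written in Python rather than Scala and does not use Saddle's API; the helper names `align_add` and `nan_mean`, and the choice of NaN as the missing-value marker, are assumptions made for the example.

```python
import math


def align_add(left, right):
    """Add two index->value mappings on the union of their indices.

    A position present in only one input produces NaN in the result.
    """
    index = sorted(set(left) | set(right))
    return {k: left.get(k, math.nan) + right.get(k, math.nan) for k in index}


def nan_mean(series):
    """Mean over the non-missing values; NaN if none remain."""
    values = [v for v in series.values() if not math.isnan(v)]
    return sum(values) / len(values) if values else math.nan


a = {"mon": 1.0, "tue": 2.0, "wed": 3.0}
b = {"tue": 10.0, "wed": 20.0, "thu": 30.0}
total = align_add(a, b)
```

Here `total` is defined on all four days, but only "tue" and "wed" carry numbers; the statistics helper skips the missing entries instead of propagating NaN.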

Saddle draws inspiration from several sources, among them the R programming language & statistical environment, the numpy and pandas Python libraries, and the Scala collections library.


A Scala to JavaScript compiler. Scala.js compiles Scala code to JavaScript, allowing you to write your web application entirely in Scala.


The ScalaLab project aims to provide an efficient scientific programming environment for the Java Virtual Machine. The scripting language is based on the Scala programming language enhanced with high level scientific operators and with an integrated environment that provides a MATLAB-like working style. Also, all the huge libraries of Java scientific code are easily accessible (and many times with a more convenient syntax). The main potential of ScalaLab is numerical code speed and flexibility. The statically typed Scala language can provide speeds of scripting code similar to pure Java. A major design priority of ScalaLab is its user-friendly interface. We want the user to enjoy writing scientific code, and we design the whole framework with this objective.


A suite of machine learning and numerical computing libraries. ScalaNLP is the umbrella project for several libraries, including Breeze and Epic. Breeze is a set of libraries for machine learning and numerical computing. Epic is a high-performance statistical parser and structured prediction library.


ScMathML is a Scala library for executing Content MathML. Content MathML is a move towards a standard, open format for representing mathematics with relatively well defined semantics. ScMathML takes formulas, and evaluates them in a Context, which provides access to domain objects, constants etc.
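
As a rough picture of what "evaluating a formula in a Context" means, the sketch below walks a Content MathML tree in Python. It is not ScMathML's API: the `evaluate` function and the `OPERATORS` table are hypothetical, only a handful of operator elements are handled, and XML namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET
from functools import reduce

# Map Content MathML operator elements to Python functions.
OPERATORS = {
    "plus": lambda args: sum(args),
    "times": lambda args: reduce(lambda a, b: a * b, args, 1.0),
    "minus": lambda args: -args[0] if len(args) == 1 else args[0] - args[1],
    "divide": lambda args: args[0] / args[1],
    "power": lambda args: args[0] ** args[1],
}


def evaluate(node, context):
    """Recursively evaluate <cn>, <ci> and <apply> elements."""
    if node.tag == "cn":            # numeric literal
        return float(node.text)
    if node.tag == "ci":            # identifier, looked up in the context
        return context[node.text.strip()]
    if node.tag == "apply":         # first child is the operator
        operator, *operands = list(node)
        values = [evaluate(child, context) for child in operands]
        return OPERATORS[operator.tag](values)
    raise ValueError(f"unsupported element: {node.tag}")


# Encodes 2 * x + y
expr = ET.fromstring(
    "<apply><plus/>"
    "<apply><times/><cn>2</cn><ci>x</ci></apply>"
    "<ci>y</ci>"
    "</apply>"
)
result = evaluate(expr, {"x": 3.0, "y": 4.0})
```

The context here is a plain dictionary; a richer context could resolve identifiers to domain objects or constants instead.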


A Scala project that harvests sensor data from web sources. The data is then pushed to an SOS using the sos-injection module project. SosInjector is a project that wraps a Sensor Observation Service (SOS). The sos-injection module provides Java classes to enter stations, sensors, and observations into an SOS.

sensor-web-harvester is used to fill an SOS with observations from many well-known sensor sources (such as NOAA and NERRS). This project pulls sensor observation values from the source’s stations. It then formats the data to be placed into the user’s SOS by using the sos-injector. The source stations used are filtered by a chosen bounding box area.
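
The bounding-box filtering of source stations amounts to a simple coordinate test. The sketch below is in Python rather than Scala; the `Station` class, the station names, and the coordinates are invented for illustration, and boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    name: str
    lat: float
    lon: float


def in_bounding_box(station, south, west, north, east):
    """True when the station lies inside the box (edges included)."""
    return south <= station.lat <= north and west <= station.lon <= east


# Hypothetical stations, for illustration only.
stations = [
    Station("A", 41.5, -71.3),
    Station("B", 29.9, -90.1),
    Station("C", 42.3, -70.9),
]
selected = [s.name for s in stations if in_bounding_box(s, 40.0, -72.0, 43.0, -70.0)]
```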


Schur is a standalone C program for interactively calculating properties of Lie groups and symmetric functions. Schur has been designed to answer questions of relevance to a wide range of problems of special interest to chemists, mathematicians and physicists - particularly for persons who need specific knowledge relating to some aspect of Lie groups or symmetric functions and yet do not wish to be encumbered with complex algorithms. The objective of Schur is to supply results with the complexity of the algorithms hidden from view so that the user can effectively use Schur as a scratch pad, obtaining a result and then using that result to derive new results in a fully interactive manner. Schur can be used as a tool for calculating branching rules, Kronecker products, Casimir invariants, dimensions, plethysms, S-function operations, Young diagrams and their hook lengths etc.

As well as being a research tool, Schur is an excellent tool for helping students to independently explore the properties of Lie groups and symmetric functions and to test their understanding by creating simple examples and moving on to more complex examples. The user has at his or her disposal over 160 commands which may be nested to give a vast variety of potential operations. Every command, with examples, is described in a 200 page manual. Attention has been given to input/output issues to simplify input and to give a well organized output. The output may be obtained in TeX form if desired. Log files may be created for subsequent editing. Online help files may be brought to screen at any time.


ScientiFig is a free tool to help you create, format or reformat scientific figures.


The SciRuby Project aims to provide Ruby with scientific capabilities similar to what the wonderful NumPy and SciPy libraries bring to Python. Our goal is to provide a complete suite of statistical, numerical, and visualization software tools for scientific computing.


SCIRun is a problem solving environment or "computational workbench" in which a user selects software modules that can be connected in a visual programming environment to create a high level workflow for experimentation. Each module exposes all the available parameters necessary for scientists to adjust the outcome of their simulation or visualization. The networks in SCIRun are flexible enough to enable duplication of networks and creation of new modules.

Many SCIRun users find this software particularly useful for their bioelectric field research. Their topics of investigation include cardiac electro-mechanical simulation, ECG and EEG forward and inverse calculations, modeling of deep brain stimulation, electromyography calculation, and determination of the electrical conductivity of anisotropic heart tissue. Users have also made use of SCIRun for the visualization of breast tumor brachytherapy, computer aided surgery, teaching, and a number of non-biomedical applications.

SciTools Github


A Python WMS service for geospatial gridded data (Only triangular unstructured meshes and logically rectangular grids officially supported at this time).

Sensor Web


A reference implementation of the OGC Sensor Observation Service specification (version 2.0).


An OGC SOS server implementation written in Python. istSOS allows for managing and dispatching observations from monitoring sensors according to the Sensor Observation Service standard. The project also provides a graphical user interface that eases daily operations and a RESTful Web API for automating administration procedures.


A Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models. It supports WMS, WFS, WCS, WMC, SOS, SensorML, CSW, WPS, Filter, OWS Common, etc.


An OGC CSW server implementation written in Python that allows for the publishing and discovery of geospatial metadata, providing a standards-based metadata and catalogue component of spatial data infrastructures.


Python library for collecting Met/Ocean observations. Pyoos attempts to fill the need for a high level data collection library for met/ocean data publicly available through many different websites and webservices.

Pyoos will collect and parse the following data services into the Paegan Discrete Geometry CDM:

  • IOOS SWE SOS 1.0 Services

    • ex. NcSOS instance:

    • ex. IOOS 52N instance:

  • NERRS Observations - SOAP

  • NDBC Observations - SOS

  • CO-OPS Observations - SOS

  • STORET Water Quality - WqxOutbound via REST

  • USGS NWIS Water Quality - WqxOutbound via REST

  • USGS Instantaneous Values - WaterML via REST

  • NWS AWC Observations - XML via REST

  • HADS (limited to a 7-day rolling window of data)


A client for Sensor Observation Services (SOS) as specified by the Open Geospatial Consortium (OGC). It allows users to retrieve metadata from SOS web services and to interactively create requests for near real-time observation data based on the available sensors, phenomena, observations et cetera using thematic, temporal and spatial filtering.


The Sparse Fast Fourier Transform is a recent algorithm developed by Hassanieh et al. [2, 3] for computing the discrete Fourier transform of signals with a sparse (exactly or approximately) frequency domain. The algorithm improves the asymptotic runtime compared to the prior methods based on pruning.


In high-end computing environments, remote file transfers of very large data sets to and from computational resources are commonplace as users are typically widely distributed across different organizations and must transfer in data to be processed and transfer out results for further analysis. Local transfers of this same data across file systems are also frequently performed by administrators to optimize resource utilization when new file systems come on-line or storage becomes imbalanced between existing file systems. In both cases, files must traverse many components on their journey from source to destination where there are numerous opportunities for performance optimization as well as failure. A number of tools exist for providing reliable and/or high performance file transfer capabilities, but most either do not support local transfers, require specific security models and/or transport applications, are difficult for individual users to deploy, and/or are not fully optimized for highest performance.

Shift is a framework for Self-Healing Independent File Transfer that provides high performance and resilience for local and remote transfers through a variety of techniques. These include end-to-end integrity via cryptographic hashes, throttling of transfers to prevent resource exhaustion, balancing transfers across resources based on load and availability, and parallelization of transfers across multiple source and destination hosts for increased redundancy and performance. In addition, Shift was specifically designed to accommodate the diverse heterogeneous environments of a widespread user base with minimal assumptions about operating environments. In particular, Shift is unique in its ability to provide advanced reliability and automatic single and multi-file parallelization to any stock command-line transfer application while being easily deployed by both individual users as well as entire organizations.
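
One of the techniques listed above, end-to-end integrity via cryptographic hashes, can be sketched as follows. This is only an illustration of the idea, not Shift's implementation: the function names, the use of SHA-256, the plain file copy standing in for a transfer, and the retry count are all assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source, destination, retries=2):
    """Copy, then compare source and destination hashes; retry on mismatch."""
    expected = file_digest(source)
    for _ in range(retries + 1):
        shutil.copyfile(source, destination)
        if file_digest(destination) == expected:
            return expected
    raise IOError(f"checksum mismatch after {retries + 1} attempts: {destination}")


with tempfile.TemporaryDirectory() as workdir:
    src = Path(workdir) / "input.bin"
    dst = Path(workdir) / "output.bin"
    src.write_bytes(b"example payload" * 1000)
    checksum = verified_copy(src, dst)
    copied_ok = dst.read_bytes() == src.read_bytes()
```

Re-running the copy on a mismatch is the simplest form of the "self-healing" behaviour; a real tool would more plausibly re-send only the corrupted portion.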


The Scalable HeterOgeneous Computing (SHOC) benchmark suite is a collection of benchmark programs testing the performance and stability of systems using computing devices with non-traditional architectures for general purpose computing. Its initial focus is on systems containing Graphics Processing Units (GPUs) and multi-core processors, and on the OpenCL programming standard. It can be used on clusters as well as individual hosts.


This project is a SimTK toolset providing general multibody dynamics capability, that is, the ability to solve Newton’s 2nd law F=ma in any set of generalized coordinates subject to arbitrary constraints. Simbody is provided as an open source, object-oriented C++ API and delivers high-performance, accuracy-controlled science/engineering-quality results.

Simbody uses an advanced Featherstone-style formulation of rigid body mechanics to provide results in Order(n) time for any set of n generalized coordinates. This can be used for internal coordinate modeling of molecules, or for coarse-grained models based on larger chunks. It is also useful for large-scale mechanical models, such as neuromuscular models of human gait, robotics, avatars, and animation. Simbody can also be used in real time interactive applications for biosimulation as well as for virtual worlds and games.

This toolset was developed originally by Michael Sherman at the Simbios Center at Stanford, with major contributions from Peter Eastman and others. Simbody descends directly from the public domain NIH Internal Variable Dynamics Module (IVM) facility for molecular dynamics developed and kindly provided by Charles Schwieters. IVM is in turn based on the spatial operator algebra of Rodriguez and Jain from NASA’s Jet Propulsion Laboratory (JPL), and Simbody has adopted that formulation.

See also PyCraft


SIMD.js is a new API being developed by Intel, Google, and Mozilla for JavaScript which introduces several new types and functions for doing SIMD computations. For example, the Float32x4 type represents 4 float32 values packed up together. The API contains functions to operate on those values together, including all the basic arithmetic operations, and operations to rearrange, load, and store such values. The intent is for browsers to implement this API directly, and provide optimized implementations that make use of SIMD instructions in the underlying hardware.

The SIMD.js API itself is in active development. The ecmascript_simd github repository currently serves as a provisional specification and also provides a polyfill implementation that offers the functionality, though of course not the accelerated performance, of the SIMD API on existing browsers. It also includes some benchmarks which double as examples of basic SIMD.js usage.


Once you have generated a discrete problem you wish to translate this abstract formulation to specific code on a certain simulation platform. Simflowny is designed as an extensible framework on which plug-ins for different simulation platforms can be easily added. The current version provides support for the Cactus simulation framework and for the SAMRAI mesh management system. Both Cactus and SAMRAI provide parallelization by leveraging MPI-based communication between computers, which permits running simulations on clusters and taking advantage of multiple cores in modern chips.

Simflowny generates Fortran code for Cactus and C++ code for SAMRAI. It is also capable of compiling and linking a final binary that can be independently used as a simulation software. Alternatively, Simflowny also provides a GUI to manage simulations within the platform. Simulations may be launched locally, or remotely, by connecting to a Grid infrastructure.

Output both in Cactus and SAMRAI is mainly generated through HDF5 files, which contain snapshots from certain instants in the simulation. These results may be visualized with a number of commercial and free visualization tools.


This project provides tools for postprocessing data on triangular grids (simplex cells), such as computing meridional and barotropic stream functions and several transports through user defined slices. The data are interpolated onto a regular grid of user defined mesh size, equidistant in each (horizontal) coordinate direction. Postprocessing takes place on this regular grid.


A web-based scientific application deployment and visualization framework for coastal modeling and beyond.


SINGE is a Python 3 code. It computes, for full spheres and spherical shells, inertial and inertia-gravito modes in the mantle frame of reference. Boussinesq, homogeneous and viscous fluids are taken into account, with various different boundary conditions (no slip / stress-free for the velocity field, constant heat flux / isothermal for the temperature).

It uses a parallel pseudo-spectral approach in spherical geometry. The velocity field is projected onto poloidal and toroidal scalars, which are expanded on spherical harmonics in the angular directions and finite differences on an irregular mesh in the radial direction.


Skeleton programming is an approach where an application is written with the help of "skeletons". A skeleton is a pre-defined, generic component such as map, reduce, scan, farm, pipeline etc. that implements a common specific pattern of computation and data dependence, and that can be customized with (sequential) user-defined code parameters. Skeletons provide a high degree of abstraction and portability with a quasi-sequential programming interface, as their implementations encapsulate all low-level and platform-specific details such as parallelization, synchronization, communication, memory management, accelerator usage and other optimizations.
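
A minimal sketch of the skeleton idea, in Python rather than the C++ used by frameworks of this kind: each skeleton fixes the pattern of computation, and the caller supplies only the sequential user function. The function names are hypothetical, and the thread pool merely stands in for a hidden parallel backend (CPython threads will not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from itertools import accumulate


def map_skeleton(user_fn, data, workers=4):
    """Apply user_fn element-wise; how the work is distributed is hidden."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(user_fn, data))  # pool.map preserves input order


def reduce_skeleton(user_fn, data):
    """Combine all elements into one value with a binary user_fn."""
    return reduce(user_fn, data)


def scan_skeleton(user_fn, data):
    """Inclusive prefix combination of the elements."""
    return list(accumulate(data, user_fn))


values = [1, 2, 3, 4, 5]
squares = map_skeleton(lambda x: x * x, values)
total = reduce_skeleton(lambda a, b: a + b, squares)
running = scan_skeleton(lambda a, b: a + b, squares)
```

The calling code stays quasi-sequential: swapping the thread pool for a process pool or an accelerator backend would not change the three lines that use the skeletons.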

SkePU is an open-source skeleton programming framework for multicore CPUs and multi-GPU systems. It is a C++ template library with six data-parallel and one task-parallel skeletons, two generic container types, and support for execution on multi-GPU systems both with CUDA and OpenCL.


SkyNet is an efficient and robust neural network training code for machine learning. It is able to train large and deep feed-forward neural networks, including autoencoders, for use in a wide range of supervised and unsupervised learning applications, such as regression, classification, density estimation, clustering and dimensionality reduction. SkyNet is implemented in C/C++ and fully parallelised using MPI.

BAMBI (Blind Accelerated Multimodal Bayesian Inference) is a Bayesian inference engine that combines the benefits of SkyNet with MultiNest. It operates by simultaneously performing Bayesian inference using MultiNest and learning the likelihood function using SkyNet. Once SkyNet has learnt the likelihood to sufficient accuracy, inference finishes almost instantaneously.


SLEPc is a software library for the solution of large scale sparse eigenvalue problems on parallel computers. It is an extension of PETSc and can be used for linear eigenvalue problems in either standard or generalized form, with real or complex arithmetic. It can also be used for computing a partial SVD of a large, sparse, rectangular matrix, and to solve nonlinear eigenvalue problems (polynomial or general). Additionally, SLEPc provides solvers for the computation of the action of a matrix function on a vector. SLEPc is based on the PETSc data structures and it employs the MPI standard for message-passing communication.


slepc4py are Python bindings for SLEPc, the Scalable Library for Eigenvalue Problem Computations.


Slicer, or 3D Slicer, is a free, open source software package for visualization and image analysis.


The Surface Water Modeling System (SMS) is a comprehensive graphical environment for one-, two-, and three-dimensional hydrodynamic modeling. A pre- and post-processor for surface water modeling and design, SMS includes 2D finite element, 2D finite difference, 3D visualization modeling tools, and limited 1D support. Supported models include the USACE-ERDC supported TABS-MD (GFGEN, RMA2, RMA4, SED2D-WES), ADCIRC, ADH, CGWAVE, CMS-Flow, CMS-Wave, STWAVE, and PTM models. Comprehensive interfaces have also been developed for facilitating the use of the FHWA-commissioned analysis package FESWMS. SMS also includes a generic model interface, which can be used to support models which have not been officially incorporated into the system.

The numeric models supported in SMS compute a variety of information applicable to surface water modeling. Primary applications of the models include calculation of water surface elevations and flow velocities for shallow water flow problems, for both steady-state and dynamic conditions. Additional applications include the modeling of contaminant migration, salinity intrusion, sediment transport (scour and deposition), wave energy dispersion, wave properties (directions, magnitudes and amplitudes) and others.

The SMS interface is composed of various modules which streamline the modeling process: Scatter Data, Map conceptualization, GIS, particle tracking, annotation, and the new raster module.


SocketCluster is an open source, multi-process realtime environment written in JavaScript (Node.js). You can build entire applications on top of it or you can use it alongside existing systems written in other languages.

SC supports both direct client-server communication and group communication via pub/sub channels.

SC is designed to scale both vertically across multiple CPU cores and horizontally across multiple machines/instances (via pub/sub channel synchronization).
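
Group communication over pub/sub channels can be pictured with a small in-process sketch. It is written in Python, not JavaScript, and has nothing to do with SocketCluster's actual API: the `Broker` class and the channel names are hypothetical, and there is no networking or cross-process synchronization here.

```python
from collections import defaultdict


class Broker:
    """In-memory channel registry: publishers and subscribers never meet directly."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, handler):
        self._subscribers[channel].append(handler)

    def publish(self, channel, message):
        """Deliver message to every handler on the channel; return the count."""
        handlers = list(self._subscribers.get(channel, ()))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
alice, bob = [], []
broker.subscribe("news", alice.append)
broker.subscribe("news", bob.append)
broker.subscribe("sport", bob.append)
delivered = broker.publish("news", "hello")
broker.publish("sport", "goal")
```

Scaling horizontally then comes down to keeping the channel state of several such brokers in sync, which is the role the text assigns to pub/sub channel synchronization.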


SOFA is an Open Source framework primarily targeted at real-time simulation, with an emphasis on medical simulation. It is mostly intended for the research community to help develop newer algorithms, but can also be used as an efficient prototyping tool.

The SOFA architecture relies on several innovative concepts, in particular the notion of multi-model representation. In SOFA, most simulation components (deformable models, collision models, instruments, …) can have several representations, connected together through a mechanism called mapping. Each representation can then be optimized for a particular task (e.g. collision detection, visualization) while at the same time improving interoperability by creating a clear separation between the functional aspects of the simulation components. As a consequence, it is possible to have models of very different nature interact together, for instance rigid bodies, deformable objects, and fluids. At a finer level of granularity, we also propose a decomposition of physical models (i.e. any model that behaves according to the laws of physics) into a set of basic components. This decomposition leads for instance to a representation of mechanical models as a set of degrees of freedom and force fields acting on these degrees of freedom. Another key aspect of SOFA is the use of a scene-graph to organize and process the elements of a simulation while clearly separating the computation tasks from their possibly parallel scheduling.

Software Collections

Software Collections give you power to build, install, and use multiple versions of software on the same system, without affecting system-wide installed packages.


Somoclu is a massively parallel tool for training self-organizing maps on large data sets written in C++. It builds on OpenMP for multicore execution, and on MPI for distributing the workload across the nodes in a cluster. It is also able to boost training by using CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R and MATLAB interfaces facilitate interactive use. Apart from fast execution, memory use is highly optimized, enabling training large emergent maps even on a single node.


A flexible package manager designed to support multiple versions, configurations, platforms, and compilers.

Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments. It was designed for large supercomputing centers, where many users and application teams share common installations of software on clusters with exotic architectures, using libraries that do not have a standard ABI. Spack is non-destructive: installing a new version does not break existing installations, so many configurations can coexist on the same system.
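
One way to picture non-destructive installs is to derive each install directory from the full build configuration, so that two configurations can never land in the same place. The Python sketch below illustrates that idea only; the `install_prefix` function, the directory layout, and the package and compiler names are hypothetical and do not reproduce Spack's actual scheme.

```python
import hashlib
import json


def install_prefix(root, name, version, compiler, variants):
    """Build a directory name that is unique per configuration."""
    spec = {
        "name": name,
        "version": version,
        "compiler": compiler,
        "variants": dict(sorted(variants.items())),
    }
    # Hash the canonical JSON form so equal configurations map to equal paths.
    digest = hashlib.sha1(json.dumps(spec, sort_keys=True).encode()).hexdigest()[:8]
    return f"{root}/{name}-{version}-{compiler}-{digest}"


# Hypothetical package configurations, for illustration only.
with_mpi = install_prefix("/opt/pkgs", "libfoo", "1.2", "gcc-4.9", {"mpi": True})
without_mpi = install_prefix("/opt/pkgs", "libfoo", "1.2", "gcc-4.9", {"mpi": False})
same_again = install_prefix("/opt/pkgs", "libfoo", "1.2", "gcc-4.9", {"mpi": True})
```

Because the two variants get different prefixes, installing one leaves the other untouched, while repeating a configuration resolves to the existing directory.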


Apache Spark is an open-source cluster computing framework originally developed in the AMPLab at UC Berkeley. In contrast to Hadoop’s two-stage disk-based MapReduce paradigm, Spark’s in-memory primitives provide performance up to 100 times faster for certain applications.[1] By allowing user programs to load data into a cluster’s memory and query it repeatedly, Spark is well suited to machine learning algorithms.[2]

Spark requires a cluster manager and a distributed storage system. For cluster management, Spark supports standalone (native Spark cluster), Hadoop YARN, or Apache Mesos.[3] For distributed storage, Spark can interface with a wide variety of systems, including Hadoop Distributed File System (HDFS),[4] Cassandra,[5] OpenStack Swift, and Amazon S3. Spark also supports a pseudo-distributed mode, usually used only for development or testing purposes, where distributed storage is not required and the local file system can be used instead; in this scenario, Spark runs on a single machine with one worker per CPU core.


SPASM is a templatised C++ library for the storage and manipulation of a variety of probabilistic representations. Representations currently considered are: Gaussian, Gaussian Mixtures, Parzen Density Estimates, Particles and Discrete Grids. This library will include:

  • Storage classes for PDFs and likelihoods

  • Basic operators such as multiplication, division, convolution and addition.

  • Conversion between common types (e.g. Gaussian)

  • Information measures (such as entropy etc.)

  • Distance measures

  • Complexity Reduction (Gaussian Mixtures have a habit of growing after common operations)
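To make two of the listed operations concrete, the sketch below multiplies two one-dimensional Gaussians and computes the entropy of a Gaussian. It is written in Python for illustration only and says nothing about SPASM's actual C++ interface; the function names are hypothetical.

```python
import math


def multiply_gaussians(mean_a, var_a, mean_b, var_b):
    """Return (mean, variance) of the normalised product of two 1-D Gaussians."""
    # Precisions (inverse variances) add; the mean is precision-weighted.
    precision = 1.0 / var_a + 1.0 / var_b
    var = 1.0 / precision
    mean = var * (mean_a / var_a + mean_b / var_b)
    return mean, var


def gaussian_entropy(var):
    """Differential entropy (in nats) of a 1-D Gaussian with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


mean, var = multiply_gaussians(0.0, 1.0, 2.0, 1.0)  # mean 1.0, variance 0.5
```

The product is always narrower than either factor, which is why this operation underlies the measurement-update step in Gaussian filtering.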

Other libraries for Bayesian filtering exist, such as Bayes++ and BFL. These libraries focus more on filtering in general than on the methods used for storage and manipulation of a variety of representations.

This library has been motivated by robotics applications, such as feature tracking, SLAM and other forms of localisation. Numerous other applications exist that require probability representations, such as statistical learning.

Spherical Harmonics Manipulator

This software computes the synthesis of spherical harmonics models on sparse coordinates or grids (provided in a geodetic or geocentric reference system). It exploits basic parallelism using OpenMP directives. A binary is available that requires the MCR library.


Tool for estimation of statistical characteristics of multivariate random functions.

This paper examines the feasibility of high-level Python based utilities for numerically intensive applications via an example of a multidimensional integration for the evaluation of the statistical characteristics of a random variable. We discuss the approaches to the implementation of mathematically formulated incremental expressions using high-level scripting code and low-level compiled code. Due to the dynamic typing of the Python language, components of the algorithm can be easily coded in a generic way as algorithmic templates. Using the Enthought Development Suite they can be effectively assembled into a flexible computational framework that can be configured to execute the code for arbitrary combinations of integration schemes and versions of instantiated code. The paper describes the development cycle using a simple running example involving averaging of a random two-parametric function that includes discontinuity. This example is also used to compare the performance of the available algorithmic and executional features.


SQuadGen is a mesh generation utility that uses a cubed-sphere base mesh to generate quadrilateral meshes with user-specified enhancements. In order to determine where enhancement is desired, the user provides a PNG file which corresponds to a latitude-longitude grid. Raster values with higher brightness (whiter values) are tagged for refinement. The algorithm uses a basic paving technique and supports two paving stencil types: Low-connectivity (LOWCONN) and CUBIT-type transition regions.


SRM-Lite is a simple command-line tool with pluggable file transfer protocol support. SRM-Lite supports scp and sftp in a high-performance way (hpn-ssh).


A probabilistic programming language for Bayesian inference written in C++.[1] The Stan language is used to specify a Bayesian statistical model, which is an imperative declaration of the log probability density function. It has interfaces for R and Python as well as a command-line interface.


STELLA is a strongly typed, object-oriented, Lisp-like language, designed to facilitate symbolic programming tasks in artificial intelligence applications. STELLA preserves those features of Common Lisp deemed essential for symbolic programming such as built-in support for dynamic data structures, heterogeneous collections, first-class symbols, powerful iteration constructs, name spaces, an object-oriented type system with a meta-object protocol, exception handling, and language extensibility through macros, but without compromising execution speed, interoperability with non-STELLA programs, and platform independence. STELLA programs are translated into a target language such as C++, Common Lisp, or Java, and then compiled with the native target language compiler to generate executable code. The language constructs of STELLA are restricted to those that can be translated directly into native constructs of the intended target languages, thus enabling the generation of highly efficient as well as readable code.


StocPy is an expressive probabilistic programming language, provided as a Python library.

We introduce the first, general purpose, slice sampling inference engine for probabilistic programs. This engine is released as part of StocPy, a new Turing-Complete probabilistic programming language, available as a Python library. We present a transdimensional generalisation of slice sampling which is necessary for the inference engine to work on traces with different numbers of random variables. We show that StocPy compares favourably to other PPLs in terms of flexibility and usability, and that slice sampling can outperform previously introduced inference methods. Our experiments include a logistic regression, HMM, and Bayesian Neural Net.
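For readers unfamiliar with slice sampling, the following is a minimal sketch of the ordinary one-dimensional version with stepping-out and shrinkage. It is not StocPy's engine and does not attempt the transdimensional generalisation described above; the function name and parameters are illustrative assumptions.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, seed=0):
    """Draw samples from an unnormalised 1-D log density by slice sampling."""
    rng = random.Random(seed)
    samples = []
    x = x0
    for _ in range(n_samples):
        # Pick a height uniformly under the density at x (done in log space).
        log_y = log_density(x) + math.log(1.0 - rng.random())
        # Step out: grow an interval around x until both ends leave the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Shrink: propose uniformly in the interval, narrowing it on rejection.
        while True:
            candidate = rng.uniform(left, right)
            if log_density(candidate) > log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        samples.append(x)
    return samples


# Example target: a standard normal, given up to an additive constant.
draws = slice_sample(lambda x: -0.5 * x * x, x0=0.0, n_samples=4000)
```

Unlike Metropolis-style samplers, this scheme needs no tuned proposal distribution; the `width` parameter only affects how many density evaluations each draw costs.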


Streamline Version 4 is a versatile Fortran 77 & C++ program for calculating charged test particle trajectories or field-lines for user-specified fields using the test-particle method. The user has the freedom to specify any type of field (analytical, tabulated in files, time dependent, etc.) and maintains complete control over initial conditions of trajectories/field-lines and boundary conditions of specified fields. The structure of Streamline was redesigned from previous versions in order to know not only particle or field-line positions and velocities at each step of the simulations, but also the instantaneous field values as seen by particles. This was done to compute the instantaneous value of the particle’s magnetic moment, but other applications are possible too. Accuracy tests of the code are shown for different cases, i.e., particles moving in a constant magnetic field, a magnetic plus constant electric field, and a wave field. In addition, in the last part of the paper we concentrate our discussion on the study of velocity space diffusion of charged particles in turbulent slab fields, paying attention to the discretization of the fields and the temporal discretization of the dynamical equations. The diffusion of charged particles is a very common topic in plasma physics and astrophysics since it plays an important role in many different phenomena such as stochastic particle acceleration, diffusive shock acceleration, solar energetic particle propagation, and the scattering required for the solar modulation of galactic cosmic rays.
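The constant-magnetic-field accuracy test mentioned above can be sketched in a few lines. The integrator below is the widely used Boris scheme, chosen here only for illustration; the description does not say which integrator Streamline uses, and all names and normalised units are assumptions.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def boris_step(x, v, e_field, b_field, q_over_m, dt):
    """Advance position and velocity by one Boris step in given E and B fields."""
    half = 0.5 * q_over_m * dt
    # First half of the electric acceleration.
    v_minus = tuple(vi + half * ei for vi, ei in zip(v, e_field))
    # Rotation of the velocity around the magnetic field direction.
    t = tuple(half * bi for bi in b_field)
    s = tuple(2.0 * ti / (1.0 + sum(c * c for c in t)) for ti in t)
    v_prime = tuple(a + b for a, b in zip(v_minus, cross(v_minus, t)))
    v_plus = tuple(a + b for a, b in zip(v_minus, cross(v_prime, s)))
    # Second half of the electric acceleration, then the position update.
    v_new = tuple(vi + half * ei for vi, ei in zip(v_plus, e_field))
    x_new = tuple(xi + dt * vi for xi, vi in zip(x, v_new))
    return x_new, v_new


def magnetic_moment(v, b_field, mass=1.0):
    """Magnetic moment m * v_perp^2 / (2 |B|) from the field seen by the particle."""
    b_mag = math.sqrt(sum(bi * bi for bi in b_field))
    v_par = sum(vi * bi for vi, bi in zip(v, b_field)) / b_mag
    v_perp_sq = sum(vi * vi for vi in v) - v_par * v_par
    return 0.5 * mass * v_perp_sq / b_mag


# Gyration in a uniform field along z: the magnetic moment should stay constant.
B = (0.0, 0.0, 1.0)
E = (0.0, 0.0, 0.0)
x, v = (0.0, 0.0, 0.0), (1.0, 0.0, 0.5)
mu_start = magnetic_moment(v, B)
for _ in range(1000):
    x, v = boris_step(x, v, E, B, q_over_m=1.0, dt=0.05)
mu_end = magnetic_moment(v, B)
```

Evaluating the moment from the instantaneous field at the particle, as `magnetic_moment` does, mirrors the reason given above for tracking field values along the trajectory.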

Structure Synth

Structure Synth is a cross-platform application for generating 3D structures by specifying a design grammar. Even simple systems may generate surprising and complex structures. Structure Synth is built in C++, OpenGL, and Qt 4.

See also Context Free Art.


SuperCollider is an environment and programming language for real time audio synthesis and algorithmic composition. It provides an interpreted object-oriented language which functions as a network client to a state of the art, realtime sound synthesis server.

SuperCollider is an environment and programming language originally released in 1996 by James McCartney for real-time audio synthesis and algorithmic composition. Since then it has been evolving into a system used and further developed by both scientists and artists working with sound. It is an efficient and expressive dynamic programming language providing a framework for acoustic research, algorithmic music, and interactive programming.


SYMPLER is an object-oriented particle dynamics code. With this freely available software, simulations can be performed ranging from microscopic classical molecular dynamics up to the Lagrangian particle-based discretisation of macroscopic continuum mechanics equations. We show how the runtime definition of arbitrary degrees of freedom and of arbitrary equations of motion allows for modular and symbolic computation with high flexibility. Arbitrary symbolic expressions for inter-particle forces can be defined as well as fluxes of arbitrarily many additional scalar, vectorial or tensorial degrees of freedom. The integration in a high performance grid computing environment makes huge geographically distributed computational resources accessible to the software by an easy-to-use interface.


TACO is an object-oriented control system originally developed at the European Synchrotron Radiation Facility (ESRF) to control accelerators, beamlines and data acquisition systems. TACO is very scalable and can be used for simple laboratory-like setups with only a few devices or for a big installation comprising thousands of devices. TACO is a cheap and simple solution for doing distributed home automation. TACO is available free of charge without warranties.

TACO is object-oriented because it treats ALL (physical and logical) control points in a control system as objects in a distributed environment. All actions are implemented in classes. New classes can be constructed out of existing classes in a hierarchical manner, thereby ensuring a high level of software reuse. Classes can be written in C++, in C (using a methodology called Objects in C), in Python or in LabView (using G).

TACO has been largely superseded by TANGO.


Tahoe-LAFS is a Free and Open decentralized cloud storage system. It distributes your data across multiple servers. Even if some of the servers fail or are taken over by an attacker, the entire file store continues to function correctly, preserving your privacy and security.


Taiga greatly simplifies the use of science data. It is a self-sufficient bundle of free/open source software that webifies major scientific data formats, such as NetCDF, HDF4 and HDF5. Through webification (w10n), meta attributes and data arrays inside a file can be directly retrieved, transformed, or manipulated using clear and meaningful URLs.


TakTuk is a tool for deploying parallel remote executions of commands to a potentially large set of remote nodes. It spreads itself using an adaptive algorithm and sets up an interconnection network to transport commands and perform I/O multiplexing/demultiplexing. The TakTuk mechanics dynamically adapt to the environment (machine performance and current load, network contention) by using a reactive work-stealing algorithm that mixes local parallelization and work distribution.

TakTuk is a tool especially suited to the administration of parallel machines because it eases the handling of groups of hosts. It might be used in batch mode for simple machine state tests (e.g. testing host responsiveness simply by letting the engine set the network up using its default connector, ssh) or in interactive mode for deep investigation on several hosts, using the TakTuk command interpreter to execute multiple commands on multiple hosts (standard test sequence on a group of hosts, ping pong test between several machines, …).


TenEig is a MATLAB toolbox to find eigenpairs of a tensor. TenEig is written in MATLAB based on PSOLVE and provides routines for solving different kinds of eigenvalue problems of a tensor, like E-eigenvalues, eigenvalues, Z-eigenvalues (real E-eigenvalues), H-eigenvalues (real eigenvalues), or more general mode-k eigenvalues. The corresponding eigenpairs are also provided.


A Matlab toolbox for tensor computations.


The Python Tensor Toolbox provides functionalities for the decomposition of tensors in tensor-train format [1] and spectral tensor-train format [2].

Tensor Toolbox

The Tensor Toolbox provides the following classes for manipulating dense, sparse, and structured tensors using MATLAB’s object-oriented features:

  • tensor - A (dense) multidimensional array (extends MATLAB’s current capabilities).

  • sptensor - A sparse multidimensional array.

  • tenmat - Store a tensor as a matrix, with extra information so that it can be converted back into a tensor.

  • sptenmat - Store an sptensor as a sparse matrix in coordinate format, with extra information so that it can be converted back into an sptensor.

  • ttensor - Store a tensor decomposed as a Tucker operator.

  • ktensor - Store a tensor decomposed as a Kruskal operator.
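To show what the Kruskal form in the last item stores, the sketch below expands a three-way Kruskal tensor (a weighted sum of rank-one outer products) into a dense nested list. It is plain Python for illustration, not the toolbox's MATLAB API; the function name `ktensor_full` is hypothetical.

```python
def ktensor_full(weights, factors):
    """Expand a 3-way Kruskal tensor into a dense nested list.

    weights: list of R component weights.
    factors: three matrices A (I x R), B (J x R), C (K x R) as nested lists.
    Entry (i, j, k) is sum_r weights[r] * A[i][r] * B[j][r] * C[k][r].
    """
    a, b, c = factors
    return [
        [
            [
                sum(w * a[i][r] * b[j][r] * c[k][r]
                    for r, w in enumerate(weights))
                for k in range(len(c))
            ]
            for j in range(len(b))
        ]
        for i in range(len(a))
    ]


# A rank-one example: 2 * outer([1, 2], [1, 3], [1]).
dense = ktensor_full([2.0], ([[1.0], [2.0]], [[1.0], [3.0]], [[1.0]]))
```

The decomposed form stores R * (I + J + K) factor entries plus R weights, whereas the dense array stores I * J * K entries.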


Terra is a new low-level system programming language that is designed to interoperate seamlessly with the Lua programming language.

Like C, Terra is a simple, statically-typed, compiled language with manual memory management. But unlike C, it is designed from the beginning to interoperate with Lua. Terra functions are first-class Lua values created using the terra keyword. When needed they are JIT-compiled to machine code.


An open source platform for working with collections of texts. It enables students, researchers and teachers to share and collaborate around texts using a simple and intuitive interface. TEXTUS currently enables users to:

  • Collaboratively annotate texts and view the annotations of others

  • Reliably cite electronic versions of texts

  • Create bibliographies with stable URLs to online versions of those texts




NcSOS adds an OGC SOS service to datasets in your existing THREDDS server. It complies with the IOOS SWE Milestone 1.0 templates and requires your datasets be in any of the CF 1.6 Discrete Sampling Geometries.

NcSOS acts like other THREDDS services (such as OPeNDAP and WMS) in that there are individual service endpoints for each dataset. It is best to aggregate your files and enable the NcSOS service on top of the aggregation. For example, the NcML aggregate of hourly files from an individual station would be a good candidate to serve with NcSOS; serving the individual hourly files with NcSOS would not be as beneficial.


Styles for a THREDDS server.


An interface definition language and binary communication protocol that is used to define and create services for numerous languages. It is used as a remote procedure call (RPC) framework and was developed at Facebook for "scalable cross-language services development". It combines a software stack with a code generation engine to build services that work efficiently to a varying degree and seamlessly between C#, C++ (on POSIX-compliant systems), Cappuccino, Cocoa, Delphi, Erlang, Go, Haskell, Java, Node.js, OCaml, Perl, PHP, Python, Ruby and Smalltalk. Although developed at Facebook, it is now an open source project in the Apache Software Foundation.

Thrift includes a complete stack for creating clients and servers. The top part is code generated from the Thrift definition file. From this file, the services generate client and processor code. In contrast to built-in types, created data structures are sent as result in generated code. The protocol and transport layer are part of the runtime library. With Thrift, it is possible to define a service and change the protocol and transport without recompiling the code. Besides the client part, Thrift includes server infrastructure to tie protocols and transports together, like blocking, non-blocking, and multi-threaded servers. The underlying I/O part of the stack is implemented differently for different languages.


Thrust is a parallel algorithms library which resembles the C++ Standard Template Library (STL). Thrust’s high-level interface greatly enhances developer productivity while enabling performance portability between GPUs and multicore CPUs. Interoperability with established technologies (such as CUDA, TBB and OpenMP) facilitates integration with existing software.


Library for the Fourier interpolation of regular three-dimensional data-sets.


TinyOS is an "operating system" designed for low-power wireless embedded systems. Fundamentally, it is a work scheduler and a collection of drivers for microcontrollers and other ICs commonly used in wireless embedded platforms.


Topic modeling is a machine learning method that learns underlying themes in a collection of documents, which can be used to summarize and organize the documents. We have created a method for visualizing topic models, allowing users to explore a corpus by navigating between high level topic descriptions and individual documents, hopefully deepening their understanding of the corpus.


With Toolboxes for Complex Systems we provide a compilation of innovative methods for modern nonlinear data analysis. These methods were developed during scientific research in the Interdisciplinary Center for Dynamics of Complex Systems Potsdam, the Cardiovascular Physics Group at the Humboldt-Universität zu Berlin, and the Potsdam Institute for Climate Impact Research (PIK).


With the rise of governmental monitoring programs, Tox, a FOSS initiative, aims to be an easy to use, all-in-one communication platform that ensures their users full privacy and secure message delivery. The goal of this project is to create a configuration-free P2P Skype replacement. “Configuration-free” means that the user will simply have to open the program and will be capable of adding people and communicating with them without having to set up an account.


Python binding for Project Tox.


Transmageddon is a video transcoder for Linux and Unix systems built using GStreamer. It supports almost any format as its input and can generate a very large host of output files. The goal of the application was to help people to create the files they need to be able to play on their mobile devices and for people not hugely experienced with multimedia to generate a multimedia file without having to resort to command line tools with ungainly syntaxes.


TsunAWI was developed in the framework of the GITEWS project (German-Indonesian Tsunami Early Warning System). It discretizes the non-linear shallow water equations on an unstructured triangular mesh and can simulate the propagation of tsunamis from the origin to the inundation on land. It was used to calculate 4500 scenarios for the Indonesian tsunami early warning system.


The TT (Tensor Train) format is an efficient, low-parametric representation of high-dimensional tensors. The TT-Toolbox is a MATLAB implementation of basic operations with tensors in TT-format.
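In the TT format, each tensor entry is a product of small matrices, one per mode, selected by that mode's index. The sketch below evaluates a single entry from hand-built cores; it is plain Python for illustration and does not reflect the TT-Toolbox's MATLAB interface. The names and the example tensor T[i, j, k] = i + j + k are assumptions.

```python
def tt_element(cores, index):
    """Evaluate one tensor entry from its TT cores.

    cores[m][i] is the r_{m-1} x r_m matrix (nested lists) selected by value i
    of mode m, with boundary ranks r_0 = r_d = 1.
    """
    row = [1.0]
    for core, i in zip(cores, index):
        matrix = core[i]
        # Multiply the running row vector by the matrix chosen for this mode.
        row = [
            sum(row[p] * matrix[p][q] for p in range(len(row)))
            for q in range(len(matrix[0]))
        ]
    return row[0]


# Rank-2 cores representing T[i, j, k] = i + j + k on a 3 x 3 x 3 grid.
n = 3
cores = [
    [[[float(i), 1.0]] for i in range(n)],              # 1 x 2 matrices
    [[[1.0, 0.0], [float(j), 1.0]] for j in range(n)],  # 2 x 2 matrices
    [[[1.0], [float(k)]] for k in range(n)],            # 2 x 1 matrices
]
value = tt_element(cores, (2, 1, 2))  # 2 + 1 + 2
```

For d modes of size n and ranks bounded by r, the cores hold on the order of d * n * r^2 numbers instead of the n^d entries of the full tensor.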


Python implementation of the TT-Toolbox.


Users want to share more and more photos and videos. But mobile networks are fragile. Platform APIs are a mess. Every project builds its own file uploader. A thousand one-week projects that barely work, when all we need is one real project, done right.

We are going to do this right. We will solve reliable file uploads for once and for all. A new open protocol for resumable uploads built on HTTP. Simple, cheap, reusable stacks for clients and servers. Any language, any platform, any network.


Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.


The Uintah software suite is a set of libraries and applications for simulating and analyzing complex chemical and physical reactions. These reactions are modeled by solving partial differential equations on structured adaptive grids using hundreds to thousands of processors (though smaller simulations may also be run on a scientist’s desktop computer). Key software applications have been developed for exploring the fine details of metal containers (encompassing energetic materials) embedded in large hydrocarbon fires. Uintah’s underlying technologies have led to novel techniques for understanding large pool eddy fires as well as new methods for simulating fluid-structure interactions. The software is general purpose in nature and the breadth of simulation domains continues to grow beyond the original focus of the C-SAFE initiative.

User-Mode Linux

User-Mode Linux is a safe, secure way of running Linux versions and Linux processes. Run buggy software, experiment with new Linux kernels or distributions, and poke around in the internals of Linux, all without risking your main Linux setup.

User-Mode Linux gives you a virtual machine that may have more hardware and software virtual resources than your actual, physical computer. Disk storage for the virtual machine is entirely contained inside a single file on your physical machine. You can assign your virtual machine only the hardware access you want it to have. With properly limited access, nothing you do on the virtual machine can change or damage your real computer, or its software.


UV-CDAT is a powerful and complete front-end to a rich set of visual-data exploration and analysis capabilities well suited for climate-data analysis problems.


Vagrant is a tool for building complete development environments. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases development/production parity, and makes the "works on my machine" excuse a relic of the past.

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

To achieve its magic, Vagrant stands on the shoulders of giants. Machines are provisioned on top of VirtualBox, VMware, AWS, or any other provider. Then, industry-standard provisioning tools such as shell scripts, Chef, or Puppet, can be used to automatically install and configure software on the machine.


The Vaucanson platform VCSN is software dedicated to the computation of, and with, finite state machines. Here, finite state machines are to be understood in the broadest possible sense: finite automata with output (then often called transducers) or, even more generally, finite automata with multiplicity, that is, automata that not only accept, or recognize, sequences of symbols but also compute for every such sequence a value that is associated with it and that can be taken in any semiring. Hence the variety of situations that can be modelled in this way.

VCSN has been designed with (at least) three goals in mind: to allow generic programming of a wide class of finite automata, to provide a language close to the mathematical description of algorithms on automata, and to be free and open software.
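The idea of an automaton that computes a semiring value for each input sequence can be sketched as follows. This is an illustrative Python toy, not VCSN's interface; the function `word_weight`, the dictionary encoding of transitions and the example automaton are all assumptions.

```python
import operator


def word_weight(word, initial, transitions, final, plus, times, zero):
    """Weight assigned to `word` by a weighted automaton over a semiring.

    initial/final map states to weights; transitions maps (state, symbol)
    to a list of (target_state, weight) pairs.
    """
    # current[state] = combined weight of all paths reading the prefix so far.
    current = dict(initial)
    for symbol in word:
        nxt = {}
        for state, weight in current.items():
            for target, w in transitions.get((state, symbol), []):
                nxt[target] = plus(nxt.get(target, zero), times(weight, w))
        current = nxt
    total = zero
    for state, weight in current.items():
        if state in final:
            total = plus(total, times(weight, final[state]))
    return total


# Over the (+, *) semiring on integers, this automaton counts occurrences of "a":
# each accepting path guesses one "a" at which to move from p to q.
COUNT_A = {
    ("p", "a"): [("p", 1), ("q", 1)],
    ("p", "b"): [("p", 1)],
    ("q", "a"): [("q", 1)],
    ("q", "b"): [("q", 1)],
}
count = word_weight("abaa", {"p": 1}, COUNT_A, {"q": 1},
                    operator.add, operator.mul, 0)  # 3
```

Passing `min` and `operator.add` (with infinity as the zero element) instead evaluates the same structure over the tropical semiring, where the result is the cost of the cheapest accepting path.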


We present a new package, VEST (Vector Einstein Summation Tools), that performs abstract vector calculus computations in Mathematica. Through the use of index notation, VEST is able to reduce three-dimensional scalar and vector expressions of a very general type to a well defined standard form. In addition, utilizing properties of the Levi-Civita symbol, the program can derive types of multi-term vector identities that are not recognized by reduction, subsequently applying these to simplify large expressions.
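As a small illustration of the kind of Levi-Civita property such index-notation reductions rely on, the sketch below checks the standard contraction identity eps_ijk eps_imn = delta_jm delta_kn - delta_jn delta_km by brute force. It is a Python check written for this note, not VEST or Mathematica code.

```python
from itertools import product


def levi_civita(i, j, k):
    """+1 for even permutations of (0, 1, 2), -1 for odd ones, 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def contraction_identity_holds():
    """Check eps_ijk eps_imn == delta_jm delta_kn - delta_jn delta_km for all indices."""
    for j, k, m, n in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, m, n) for i in range(3))
        rhs = delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
        if lhs != rhs:
            return False
    return True
```

Applied to a double cross product, this identity gives the familiar expansion a x (b x c) = b (a . c) - c (a . b).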


ViennaCL is a free open-source linear algebra library for computations on many-core architectures (GPUs, MIC) and multi-core CPUs. The library is written in C++ and supports CUDA, OpenCL, and OpenMP. In addition to core functionality and many other features including BLAS level 1-3 support and iterative solvers, the latest release family ViennaCL 1.6.x provides fast pipelined iterative solvers including fast sparse matrix-vector products based on CSR-adaptive, a new fully HTML-based documentation, and a new sparse matrix type. Also, a Python wrapper named PyViennaCL is available.


VIGRA stands for "Vision with Generic Algorithms". It’s an image processing and analysis library that puts its main emphasis on customizable algorithms and data structures. VIGRA is especially strong for multi-dimensional images, because many algorithms (e.g. filters, feature computation, superpixels) are implemented for arbitrarily high dimensions. By using template techniques similar to those in the C++ Standard Template Library, you can easily adapt any VIGRA component to the needs of your application, without thereby giving up execution speed. As of version 1.7.1, VIGRA also provides extensive Python bindings on the basis of the popular numpy framework.


VirtualBox is a powerful x86 and AMD64/Intel64 virtualization product for enterprise as well as home use. Not only is VirtualBox an extremely feature rich, high performance product for enterprise customers, it is also the only professional solution that is freely available as Open Source Software under the terms of the GNU General Public License (GPL) version 2.

Presently, VirtualBox runs on Windows, Linux, Macintosh, and Solaris hosts and supports a large number of guest operating systems including but not limited to Windows (NT 4.0, 2000, XP, Server 2003, Vista, Windows 7, Windows 8), DOS/Windows 3.x, Linux (2.4, 2.6 and 3.x), Solaris and OpenSolaris, OS/2, and OpenBSD. VirtualBox is being actively developed with frequent releases and has an ever growing list of features, supported guest operating systems and platforms it runs on.


The NASA Vision Workbench (VW) is a general purpose image processing and computer vision library developed by the Autonomous Systems and Robotics (ASR) Area in the Intelligent Systems Division at the NASA Ames Research Center.


Vispy is a high-performance interactive 2D/3D data visualization library. Vispy leverages the computational power of modern Graphics Processing Units (GPUs) through the OpenGL library to display very large datasets.


Vistle, the VISualization Testing Laboratory for Exascale computing, is an extensible software environment that integrates simulations on supercomputers, post-processing and parallel interactive visualization.

It has been under active development at HLRS since 2012 within the European project CRESTA and bwVisu. The objective is to provide a highly scalable successor to COVISE, exploiting data, task and pipeline parallelism in hybrid shared and distributed memory environments with acceleration hardware. Domain decompositions used during simulation can be reused for visualization.

A Vistle workflow consists of several processing modules, each of which is a parallel MPI program that uses OpenMP within nodes. These can be configured graphically or from Python. Shared memory is used for transferring data between modules on a single node. Workflows can be distributed across several clusters.

For rendering in immersive projection systems, Vistle uses OpenCOVER. Visualization parameters can be manipulated from within the virtual environment. Large data sets can be displayed with OpenGL sort-last parallel rendering and depth compositing. For scaling with the simulation on remote HPC resources, a CPU-based hybrid sort-last/sort-first parallel ray casting renderer is available. "Remote hybrid rendering" allows its output to be combined with local rendering, while ensuring smooth interactivity by decoupling it from remote rendering.

The Vistle system is modular and can be extended easily with additional visualization algorithms. Source code is available on GitHub and licensed under the LGPL.


COVISE, the collaborative visualization and simulation environment, is a modular distributed visualization system. As its focus is on visualization of scientific data in virtual environments, it comprises the VR renderer OpenCOVER.


Web-based Python Data Analysis.

A Distributed Approach to Ocean Etc. Model Data Interoperability -

Testing pyoos ioos.get_observation parser with a Milestone 1 XML template file -

Creating your own IPython Notebook Blog environment on Wakari -

Testing IOOS Infrastructure with Wakari -

Installing Iris on Wakari -

Versioning Wakari Notebooks on Github -

Running a Shared Wakari Notebook/Environment -


Walrus is a tool for interactively visualizing large directed graphs in three-dimensional space. It is technically possible to display graphs containing a million nodes or more, but visual clutter, occlusion, and other factors can diminish the effectiveness of Walrus as the number of nodes, or the degree of their connectivity, increases. Thus, in practice, Walrus is best suited to visualizing moderately sized graphs that are nearly trees. A graph with a few hundred thousand nodes and only a slightly greater number of links is likely to be comfortable to work with.

Walrus computes its layout based on a user-supplied spanning tree. Because the specifics of the supplied spanning tree greatly affect the resulting display, it is crucial that the user supply a spanning tree that is both meaningful for the underlying data and appropriate for the desired insight. The prominence and orderliness that Walrus gives to the links in the spanning tree, in contrast to all other links, means that an arbitrarily chosen spanning tree may create a misleading or ineffective visualization. Ideally, the input graphs should be inherently hierarchical.

Walrus uses 3D hyperbolic geometry to display graphs under a fisheye-like distortion. At any moment, the amount of magnification, and thus the level of visible detail, varies across the display. This allows the user to examine the fine details of a small area while always having a view of the whole graph available as a frame of reference. Graphs are rendered inside a sphere that contains the Euclidean projection of 3D hyperbolic space. Points within the sphere are magnified according to their radial distance from the center. Objects near the center are magnified, while those near the boundary are shrunk. The amount of magnification decreases continuously and at an accelerated rate from the center to the boundary, until objects are reduced to zero size at the latter, which represents infinity. By bringing different parts of a graph to the magnified central region, the user can examine every part of the graph in detail.
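The fisheye effect can be illustrated numerically. The sketch below uses the Poincaré ball model as a stand-in, which is an assumption: the description above does not say which projection of hyperbolic space Walrus uses, and the function names are hypothetical.

```python
import math


def display_radius(hyperbolic_distance):
    """Euclidean radius in the unit ball for a point at the given hyperbolic
    distance from the centre (Poincare ball model, assumed for illustration)."""
    return math.tanh(hyperbolic_distance / 2.0)


def local_scale(radius):
    """Euclidean size of a unit-sized hyperbolic object drawn at this radius."""
    return (1.0 - radius * radius) / 2.0


# Objects shrink smoothly towards zero size as they approach the boundary sphere.
for d in (0.0, 1.0, 2.0, 4.0, 8.0):
    r = display_radius(d)
    print(f"distance {d:4.1f} -> radius {r:.4f}, scale {local_scale(r):.4f}")
```

Every finite distance maps strictly inside the unit ball, so the boundary sphere plays the role of infinity, as described above.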


WCSAxes is a framework for making plots of Astronomical data in Matplotlib.


Full duplex messaging between web browsers and servers. It takes care of handling the WebSocket connections, launching your programs to handle the WebSockets, and passing messages between programs and the web browser. It’s like CGI, twenty years later, for WebSockets.


This tool provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications and for further research.

The word2vec tool takes a text corpus as input and produces the word vectors as output. It first constructs a vocabulary from the training text data and then learns vector representation of words. The resulting word vector file can be used as features in many natural language processing and machine learning applications.
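The two stages described above (building a vocabulary, then learning from word contexts) can be sketched as a preprocessing step that produces skip-gram training pairs. This is an illustrative Python toy, not the word2vec tool's code; the function names and the fixed window are assumptions, and the learning step itself is omitted.

```python
from collections import Counter


def build_vocab(tokens, min_count=1):
    """Map each sufficiently frequent word to an integer id, most frequent first."""
    counts = Counter(tokens)
    words = [w for w, c in counts.most_common() if c >= min_count]
    return {w: i for i, w in enumerate(words)}


def skipgram_pairs(tokens, vocab, window=2):
    """Return (center_id, context_id) pairs for every word within the window."""
    ids = [vocab[t] for t in tokens if t in vocab]
    pairs = []
    for pos, center in enumerate(ids):
        start = max(0, pos - window)
        stop = min(len(ids), pos + window + 1)
        for ctx in range(start, stop):
            if ctx != pos:
                pairs.append((center, ids[ctx]))
    return pairs


tokens = "the cat sat on the mat".split()
vocab = build_vocab(tokens)
pairs = skipgram_pairs(tokens, vocab, window=1)
```

In the skip-gram architecture, each pair becomes one training example in which the center word is used to predict the context word; the continuous bag-of-words architecture reverses the direction and predicts the center word from its context.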


This tool from NASA’s EOSDIS provides the capability to interactively browse global, full-resolution satellite imagery and then download the underlying data. Most of the 100+ available products are updated within three hours of observation, essentially showing the entire Earth as it looks "right now". This supports time-critical application areas such as wildfire management, air quality measurements, and flood monitoring. Arctic and Antarctic views of several products are also available for a "full globe" perspective. Browsing on tablet and smartphone devices is generally supported for mobile access to the imagery.

Worldview uses the Global Imagery Browse Services (GIBS) to rapidly retrieve its imagery for an interactive browsing experience. While Worldview uses OpenLayers as its mapping library, GIBS imagery can also be accessed from Google Earth, NASA World Wind, and several other clients. We encourage interested developers to build their own clients or integrate NASA imagery into their existing ones using these services.


This code is a reference implementation of our paper Wavelet Turbulence for Fluid Simulation. The code is intended as a pedagogical example, so clarity has been given preference over performance. Optimizations that inhibit readability have been removed, so the running times experienced will be longer than those reported in the paper.


XBeach is a two-dimensional model for wave propagation, long waves and mean flow, sediment transport and morphological changes of the nearshore area, beaches, dunes and backbarrier during storms.


Scientists at LLNL have developed an open source, non-intrusive, and general purpose parallel-in-time code, XBraid. The algorithm enables a scalable parallel-in-time approach by applying multigrid to the time dimension. It is designed to be non-intrusive. That is, users apply their existing sequential time-stepping code according to our interface, and then XBraid does the rest. Users have spent years, sometimes decades, developing the right time-stepping scheme for their problem. XBraid allows users to keep their schemes, but enjoy parallelism in the time dimension.

Traditional sequential time-marching algorithms are a critical part of any computer simulation of a time-dependent problem, but these algorithms are currently facing a sequential bottleneck. This bottleneck is driven by the broad trend that future performance gains will come from greater concurrency, not faster clock speeds. Previously, ever-increasing clock speeds decreased the compute time for each time step, thus allowing more time steps to be calculated without increasing the overall compute time. Now that clock speeds are stagnant, further refinements in time (i.e., increases in the number of time steps) will simply increase the simulation’s overall compute time. Many of these refinements in time will be required to maintain balance between spatial and temporal accuracies. Additionally, some simulations are already fully resolved in space, and it is unclear how such simulations will take advantage of the coming increases in concurrency.

LLNL researchers have advanced an alternative solution—solving all of the time steps simultaneously, with the help of a new multilevel algorithm and the massively parallel processing capabilities of current and future high-performance computers. This approach has already shown an ability to dramatically decrease the solution time for some simulations by ten-fold or more.


The Xen Project hypervisor is an open-source type-1 or baremetal hypervisor, which makes it possible to run many instances of an operating system or indeed different operating systems in parallel on a single machine (or host). The Xen Project hypervisor is the only type-1 hypervisor that is available as open source. It is used as the basis for a number of different commercial and open source applications, such as: server virtualization, Infrastructure as a Service (IaaS), desktop virtualization, security applications, embedded and hardware appliances. The Xen Project hypervisor is powering the largest clouds in production today.

The hypervisor supports running two different types of guests: Paravirtualization (PV) and Full or Hardware assisted Virtualization (HVM). Both guest types can be used at the same time on a single hypervisor. It is also possible to use techniques used for Paravirtualization in an HVM guest: essentially creating a continuum between PV and HVM. This approach is called PV on HVM.

Paravirtualization (PV) is an efficient and lightweight virtualization technique originally introduced by Xen Project, later adopted by other virtualization platforms. PV does not require virtualization extensions from the host CPU. However, paravirtualized guests require a PV-enabled kernel and PV drivers, so the guests are aware of the hypervisor and can run efficiently without emulation or virtual emulated hardware. PV-enabled kernels exist for Linux, NetBSD, FreeBSD and OpenSolaris. Linux kernels have been PV-enabled from 2.6.24 using the Linux pvops framework. In practice this means that PV will work with most Linux distributions (with the exception of very old versions of distros).

Full virtualization, or hardware-assisted virtualization (HVM), uses virtualization extensions from the host CPU to virtualize guests. HVM requires Intel VT or AMD-V hardware extensions. The Xen Project software uses Qemu to emulate PC hardware, including BIOS, IDE disk controller, VGA graphics adapter, USB controller, network adapter, etc. Virtualization hardware extensions are used to boost performance of the emulation. Fully virtualized guests do not require any kernel support. This means that Windows operating systems can be used as a Xen Project HVM guest. Fully virtualized guests are usually slower than paravirtualized guests, because of the required emulation.

Mirage OS

Mirage is an exokernel (also called a Cloud Operating System) for constructing secure, high-performance network applications across a variety of cloud computing, embedded and mobile platforms. Mirage OS was initially designed for cloud use, which is why we call it a Cloud Operating System. Mirage OS applications are developed in a high-level functional programming language (OCaml) on a desktop OS such as Linux or Mac OSX, and are then compiled into fully standalone, specialised microkernels. These microkernels run directly on Xen Project hypervisor APIs. Since the Xen Project powers most public clouds such as Amazon EC2, Rackspace Cloud, and many others, Mirage lets your servers run more cheaply, securely and faster in any Xen Project based cloud or hosting service.


XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world’s largest clouds and enterprises. XenServer is an enterprise-class, cloud-proven, virtualization platform that delivers all of the critical features needed for any server and datacenter virtualization implementation.


XIOS stands for XML-IO-SERVER, a library dedicated to I/O management of climate codes and model output.


A software package that allows the fast and easy solution of sets of ordinary, partial and stochastic differential equations, using a variety of efficient numerical algorithms. XMDS2 is a cross-platform, GPL-licensed, open source package for numerically integrating initial value problems that range from a single ordinary differential equation up to systems of coupled stochastic partial differential equations. The equations are described in a high-level XML-based script, and the package generates low-level optionally parallelised C++ code for the efficient solution of those equations. It combines the advantages of high-level simulations, namely fast and low-error development, with the speed, portability and scalability of hand-written code.


The Extensible Messaging and Presence Protocol (XMPP) is an open technology for real-time communication, which powers a wide range of applications including instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data.


XSHELLS is a high performance simulation code for the rotating Navier-Stokes equation in spherical shells, optionally coupled to the induction and temperature equation.


The X-Stack Program was created to support research that targets significant advances in programming models, languages, compilers, runtime systems and tools. The expected results of this program are complete solutions to the system software stack for Exascale computing platforms (X-Stack) which address fundamental challenges identified in the ASCR Exascale Programming Challenges Workshop, captured in the workshop report, as well as the ones identified in the ASCR Exascale Tools Workshop, captured in the workshop report. Solutions being researched involve radically new approaches to programming Exascale applications and algorithms. They will demonstrate the viability of such solutions in a broad high performance programming context and will enable automatic semantics- and performance-preserving transformations of applications (possibly with users in the loop).


ROSE is an open source compiler infrastructure to build source-to-source program transformation and analysis tools for large-scale C (C89 and C98), C++ (C++98 and C++11), UPC, Fortran (77/95/2003), OpenMP, Java, Python and PHP applications.


We are building a generic, extensible compiler infrastructure that can incorporate semantic information from domain-specific libraries to enable transformations that leverage domain-specific properties of library methods. Rather than building domain-specific compilers for each domain, our extensible compiler becomes a domain specific compiler for a domain when paired with domain-specific libraries.


Yael is a library implementing computationally intensive functions used in large scale image retrieval, such as neighbor search, clustering and inverted files. The library offers interfaces for C, Python and Matlab.


Yesod is a Haskell web framework for productive development of type-safe, RESTful, high performance web applications.


LambdaCms is a set of packaged libraries, containing subsites for the Yesod application framework, which allow rapid development of robust and highly performant websites with content management functionality.


Yorick is an interpreted programming language for scientific simulations or calculations, postprocessing or steering large simulation codes, interactive scientific graphics, and reading, writing, or translating large files of numbers. Yorick includes an interactive graphics package, and a binary file package capable of translating to and from the raw numeric formats of all modern computers. Yorick is written in ANSI C and runs on most operating systems.


The yt project aims to produce an integrated science environment for collaboratively asking and answering astrophysical questions. To do so, it will encompass the creation of initial conditions, the execution of simulations, and the detailed exploration and visualization of the resultant data. It will also provide a standard framework, based on physical quantities, for interoperability between codes.


Open Source ESB, SOA, REST, APIs and Cloud Integrations in Python. Build and orchestrate integration services, expose new or existing APIs, either cloud or on-premise, and use a wide range of connectors, data formats and protocols.

Zato facilitates intercommunication across applications and data sources spanning your organization’s business or technical boundaries and beyond, enabling you to access, design, develop or discover new opportunities and processes.


ZKCM is a C++ library developed for the purpose of multiprecision matrix computation, on the basis of the GNU MP and MPFR libraries. It provides an easy-to-use syntax and convenient functions for matrix manipulations including those often used in numerical simulations in quantum physics. Its extension library, ZKCM_QC, is developed for simulating quantum computing using the time-dependent matrix-product-state simulation method.

Original Section

Notes about how to install and use cool software.

Machine Learning and Data Mining Info

Discovering and Visualizing Patterns with Python

PDF cheatsheet (7 pp.)

Weblog of cheatsheet author:

Machine Learning: An Algorithmic Perspective

A book with lots of Python examples, the code for which is available at the link shown.

Neural Network Emulations for Complex Multidimensional Geophysical Mappings

PDF review paper (34 pp.)

Predicting Solar Energy from Weather Forecasts Using Python

Using Python to read data from NetCDF files and then perform data mining.

Application of Machine Learning Methods to Spatial Interpolation of Environmental Variables

PDF paper (13 pp.)

Review of Spatial Interpolation Methods for Environmental Scientists

PDF technical report (154 pp.)

Climate Informatics

PDF review paper (46 pp.)

Comparing Predictive Power in Climate Data: Clustering Matters

PDF paper (17 pp.)

Applying Machine Learning Methods to Climate Variability

Nonlinear Multivariate and Time Series Analysis by Neural Network Methods

Pattern Recognition in Time Series

PDF paper (28 pp.)

Application of Statistical Learning to Plankton Image Analysis

Machine Learning Algorithms for Real Data Sources with Applications to Climate Science

PDF slides (46 pp.)

Machine Learning for Climate Science

Online slides (196 pp.)

Applicability of Data Mining Techniques for Climate Prediction

PDF paper (4 pp.)

Outstanding Problems at the Interface of Climate Prediction and Data Mining

Online slides (35 pp.)

Unsupervised Machine Learning Techniques for Studying Climate Variability

PDF slides (21 pp.)

Tracking Climate Models

PDF paper (15 pp.)

Streaming Data Mining

PDF slides (229 pp.)

Machine Learning for Hackers

Book (324 pp.) with examples using R.

Python and Matlab


A software framework in Fortran to build large-scale parallel applications. It
is designed for applications using three-dimensional structured mesh and
spatially implicit numerical algorithms. At the foundation it implements a
general-purpose 2D pencil decomposition for data distribution on
distributed-memory platforms. On top it provides a highly scalable and
efficient interface to perform three-dimensional distributed FFTs. The library
is optimised for supercomputers and scales well to hundreds of thousands of
cores. It relies on MPI but provides a user-friendly programming interface
that hides communication details from application developers.
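To make the idea of a 2D pencil decomposition concrete, the sketch below computes which slab of a global 3D grid one process would own. It is written in Python for illustration only and is not the library's Fortran interface. The names `split_range` and `pencil_extents`, and the convention that the two non-pencil axes are split in order across the rows and then the columns of the process grid, are assumptions of this example.

```python
def split_range(n, parts, index):
    """Half-open [start, stop) of block `index` when n points are split
    into `parts` nearly equal contiguous blocks."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def pencil_extents(shape, proc_grid, coords, axis):
    """Local index ranges of one process for a pencil aligned with `axis`.

    shape     -- global grid size (nx, ny, nz)
    proc_grid -- 2D process grid (p_row, p_col)
    coords    -- this process's (row, col) position in that grid
    axis      -- 0, 1 or 2: the direction that stays whole on every process
    """
    ranges = []
    split = iter(zip(proc_grid, coords))
    for dim, n in enumerate(shape):
        if dim == axis:
            ranges.append((0, n))  # pencil direction is never distributed
        else:
            parts, index = next(split)
            ranges.append(split_range(n, parts, index))
    return ranges
```

Because one direction is always complete on every process, a 1D FFT can be applied along it locally. A distributed 3D FFT then amounts to alternating such local transforms with data transpositions between x-, y- and z-aligned pencils.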

See Incompact3d.


A programming environment for heterogeneous architectures.


A set of DOE-developed software tools, sometimes in collaboration with other
funding agencies (DARPA, NSF), that make it easier for programmers to write
high performance scientific applications for high-end computers.


A geographical information system to visualize netCDF files via the web. The
software consists of a server side C++ application and a client side
JavaScript application. The software provides several features to access and
visualize data over the web, and it uses OGC standards for data dissemination.


The Advanced Data mining And Machine learning System (ADAMS) is a novel,
flexible workflow engine aimed at quickly building and maintaining real-world,
complex knowledge workflows.


A system of computer programs for solving time dependent, free surface
circulation and transport problems in two and three dimensions. These programs
utilize the finite element method in space allowing the use of highly
flexible, unstructured grids.


The Adaptable IO System (ADIOS) provides a simple, flexible way for scientists
to describe the data in their code that may need to be written, read, or
processed outside of the running simulation. An XML file, external to the
code, describes the various elements, their types, and how you wish to process
them in a given run, so the routines in the host code (either Fortran or C)
can transparently change how they process the data.


A software library designed to help rapidly build scalable parallel programs.


An adaptive mesh refinement package written in Fortran 90.


An open-source, object-oriented finite element library which has the ambition
to be generic and efficient. Akantu is developed within the LSMS (Computational
Solid Mechanics Laboratory), where research is conducted at the interface of
mechanics, material science, and scientific computing. The open-source
philosophy is important for the evolution of any scientific software project.
The collaboration permitted by shared codes enforces sanity when users (and
not only developers) can criticize the implementation details. Akantu was born
with the vision to associate genericity, robustness and efficiency while
benefiting from open-source visibility.


Implementations of a few algorithms and data structures for fun and profit.


A software package providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on the Markov logic representation.


The Adaptive Mesh generator for Atmospheric and Ocean Simulation is a mesh generator for adaptive algorithms. It is capable of handling complex geometries as well as highly non-uniform refinement regions. It has a relatively simple programming interface and incorporates some optimization. There is even a 3D version of amatos.


The Adaptive Message Passing Interface is an implementation of MPI that supports dynamic load balancing and multithreading for MPI applications.


A machine independent parallel programming system. Programs written using this system will run unchanged on MIMD machines with or without a shared memory. It provides high-level mechanisms and strategies to facilitate the task of developing even highly complex parallel applications.


The Astrophysical Multipurpose Software Environment provides a software framework for astrophysical simulations, in which existing codes from different domains, such as stellar dynamics, stellar evolution, hydrodynamics and radiative transfer, can be easily coupled. AMUSE uses Python to interface with existing numerical codes. The AMUSE interface handles unit conversions, provides consistent object oriented interfaces, manages the state of the underlying simulation codes and provides transparent distributed computing.


Python bindings for the Armadillo matrix library.


A C++ linear algebra library.


A collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.



A free open-source software program for solving small to very large mathematical models. ASCEND can solve systems of non-linear equations, linear and nonlinear optimisation problems, and dynamic systems expressed in the form of differential/algebraic equations.


A SEJITS implementation for Python. Asp is a research prototype and implementation of SEJITS (Selective, Embedded Just-in-Time Specialization) for Python. With the aid of application-specific specializers, it compiles fragments of Python down to low-level parallelized CPU and GPU implementations.


A Python web framework that makes the most of the filesystem. Simplates are

Cacti

A complete network graphing solution designed to harness the power of
RRDTool's data storage and graphing functionality. Cacti provides a fast
poller, advanced graph templating, multiple data acquisition methods, and user
management features out of the box. All of this is wrapped in an intuitive,
easy to use interface that makes sense for LAN-sized installations up to
complex networks with hundreds of devices.


The OpenSource industry standard, high performance data logging and graphing
system for time series data. RRDtool can be easily integrated in shell
scripts, perl, python, ruby, lua or tcl applications.


Cactus is an open source problem solving environment designed for scientists
and engineers. Its modular structure easily enables parallel computation
across different architectures and collaborative code development between
different groups. Cactus originated in the academic research community, where
it was developed and used over many years by a large international
collaboration of physicists and computational scientists.

The name Cactus comes from the design of a central core ("flesh") which
connects to application modules ("thorns") through an extensible interface.
Thorns can implement custom developed scientific or engineering applications,
such as computational fluid dynamics. Other thorns from a standard
computational toolkit provide a range of computational capabilities, such as
parallel I/O, data distribution, or checkpointing.

Cactus runs on many architectures. Applications, developed on standard
workstations or laptops, can be seamlessly run on clusters or supercomputers.
Cactus provides easy access to many cutting edge software technologies being
developed in the academic research community, including the Globus
Metacomputing Toolkit, HDF5 parallel file I/O, the PETSc scientific library,
adaptive mesh refinement, web interfaces, and advanced visualization tools.


A computer algebra system (CAS) designed specifically for the solution of
problems encountered in field theory. It has extensive functionality for
tensor computer algebra, tensor polynomial simplification including multi-term
symmetries, fermions and anti-commuting variables, Clifford algebras and Fierz
transformations, implicit coordinate dependence, multiple index types and many
more. The input format is a subset of TeX. Both a command-line and a graphical
interface are available.


A framework for convolutional neural network algorithms, developed with speed in mind. Caffe aims to provide computer vision scientists and practitioners with a clean and modifiable implementation of state-of-the-art deep learning algorithms. For example, network structure is easily specified in separate config files, with no mess of hard-coded parameters in the code. At the same time, Caffe fits industry needs, with blazing fast C++/CUDA code for GPU computation.

Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (≈ 2.5 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.


The Cameleon language is a graphical data flow language following a two-scale paradigm. It allows easy up-scaling, that is, the integration of any library written in C++ into the data flow language. The Cameleon language aims to democratize macro-programming through an intuitive interaction between the human and the computer, where building an application based on a data process and a GUI is a simple task to learn and to do. The Cameleon language allows conditional execution and repetition to solve complex macro-problems. In this paper we introduce a new model based on an extension of the Petri net model to describe how the Cameleon language executes a composition.



A set of libraries for performing various tasks in radio astronomy.


Python bindings for the casacore radio astronomy libraries.


A simple, portable, high-performance, scalable, and robust communication interface for HPC and Data Centers. Targeted towards high performance computing (HPC) environments as well as large data centers, CCI can provide a common network abstraction layer (NAL) for persistent services as well as general interprocess communication. In HPC, MPI is the de facto standard for communication within a job. Persistent services such as distributed file systems, code coupling (e.g. a simulation sending output to an analysis application sending its output to a visualization process), health monitoring, debugging, and performance monitoring, however, exist outside of scheduler jobs or span multiple jobs. In these cases, these persistent services tend to use either BSD sockets for portability to avoid having to rewrite the applications for each new interconnect or they implement their own NAL which takes developer time and effort. CCI can simplify support for these persistent services by providing a common NAL which minimizes the maintenance and support for these services while providing improved performance (i.e. reduced latency and increased bandwidth) compared to Sockets.


An asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
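The difference between asynchronous and synchronous execution can be sketched with the Python standard library alone. This is not the task queue's own API: a local thread pool stands in for the worker servers, there is no message broker, and the names `add` and `run_demo` are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    """Stand-in for a task, the unit of work handed to the queue."""
    return x + y


def run_demo():
    # Two worker threads stand in for worker servers.
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Asynchronous: submit() returns at once with a handle to the
        # pending result while the task runs in the background.
        pending = [pool.submit(add, n, n) for n in range(4)]
        # Synchronous: result() blocks until that particular task is ready.
        return [future.result() for future in pending]
```

Calling `run_demo()` returns the results in submission order, regardless of which worker finished first, because each handle is waited on individually.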


Robust messaging for applications.


An advanced key-value store often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.


Cello is a GNU99 C library which brings higher level programming to C.


A package of high-resolution central schemes for nonlinear conservation laws and related problems.


A compiler infrastructure for the source-to-source transformation of software programs.


A JavaScript library for creating 3D and 2D maps in a web browser without a plugin. The features include:

  • Create data-driven time-dynamic scenes using CZML. See the CZML Guide.

  • Visualize high-resolution worldwide Terrain. See STK World Terrain.

  • Layer imagery from multiple sources, including WMS, TMS, WMTS, OpenStreetMap, Bing Maps, ArcGIS MapServer, Google Earth Enterprise, and standard image files. Each layer can be alpha-blended with the layers below it, and its brightness, contrast, gamma, hue, and saturation can be dynamically changed.

  • Draw GeoJSON and TopoJSON.

  • Draw 3D models using COLLADA and glTF with animations and skins.

  • Draw and style a wide range of geometries.

  • Draw the atmosphere, sun, sun lighting, moon, stars, and water.

  • Individual object picking.

  • Camera navigation with mouse and touch handlers for rotate, zoom, pan with inertia, flights, free look, and terrain collision detection.

  • Batching, culling, and JavaScript and GPU optimizations for performance.

  • Precision handling for large view distances (avoiding z-fighting) and large world coordinates (avoiding jitter).


A library to read and write CZML files for Cesium.


Implements the CF data model for the reading, writing and processing of data and its metadata.


A Python interface to UNIDATA’s Udunits-2 package with CF extensions.

Build to Order BLAS

The Build to Order BLAS system is a compiler that generates high-performance implementations of basic linear algebra kernels.

The term BLAS in the name is for Basic Linear Algebra Subprograms. The BLAS is a standard API for important linear algebra operations. The BLAS are implemented by most hardware vendors. Traditionally, each routine in the BLAS is implemented by hand by a highly skilled programmer. The Build to Order BLAS compiler automates the implementation of not only the BLAS standard but also any sequence of basic linear algebra operations.

The user of the Build to Order BLAS compiler writes down a specification for a sequence of matrix and vector operations together with a description of the input and output parameters. The compiler then tries out many different choices of how to implement, optimize, and tune those operations for the user’s computer hardware. The compiler chooses the best option, which is output as a C file containing a function that implements the specified operations.


Demonstrates the theory of convolution underlying engineering systems and signal analysis. Designed to enhance the learning experience, C-Graph features an attractive array of scalable pulses, periodic, and aperiodic signal types of variable frequency fundamental to the study of systems theory. The package displays the spectra of any two waveforms chosen by the user, computes their linear convolution, then compares their circular convolution according to the convolution theorem. Each signal is modelled by a register of N discrete values (samples), and the discrete Fourier Transform (DFT) computed by the Fast Fourier Transform (FFT). Students of signal and systems theory will find GNU C-Graph to be of value in visualizing convolution.
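The relationship between linear and circular convolution mentioned above can be checked numerically. The sketch below is independent of C-Graph and uses a naive O(N²) DFT in place of an FFT (both produce the same values); all function names are chosen for this example. It relies on the convolution theorem: multiplying two DFTs term by term and inverting gives the circular convolution, which matches the linear convolution once both signals are zero-padded to length N1 + N2 - 1.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform of a sequence of N samples."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def linear_convolution(a, b):
    """Direct (time-domain) linear convolution, length len(a) + len(b) - 1."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def circular_convolution(a, b):
    """Circular convolution of equal-length real signals via the convolution theorem."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    product = [p * q for p, q in zip(dft(a), dft(b))]
    return [value.real for value in dft(product, inverse=True)]


def zero_pad(x, n):
    """Extend x with zeros to length n."""
    return list(x) + [0.0] * (n - len(x))
```

For `a = [1, 2, 3]` and `b = [0, 1, 0.5]`, the circular convolution of the unpadded signals wraps the tail of the linear result back onto its start. After padding both signals to length 5, the two results agree up to floating-point rounding.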


A versatile genetic programming application which includes a command-line client and an interactive console mode. It features built in input-output mapping support, and is user-extensible for complex fitness evaluation in Python and Lisp.


Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora’s capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.

Cilk Plus

Adds simple language extensions to the C and C++ languages to express task and data parallelism. These language extensions are powerful, yet easy to apply and use in a wide range of applications.

This was an MIT research program that got folded into the commercially available Intel C++ Compiler Suite. There is a branch of the GCC compiler development stack that’s also in the process of including Cilk.


The main objectives of the METAFOR project were to develop and promulgate an ipso-facto standard for describing climate models and associated data. This standard has been formalized and named the Common Information Model (CIM). Adoption of the CIM standard will allow the climate science community to nurture an eco-system of CIM compliant tools and services to be integrated into the day to day activities of climate research institutes worldwide. The CIM is an ontology, i.e. an informational model describing a particular domain (i.e. climate science). Such a model is formed using a construct known as a class (e.g. simulation). Classes form relationships with other classes (e.g. a simulation has data). Related classes are grouped into packages. The CIM is formally defined using the Unified Modelling Language.


The Coupled-Layer Architecture for Robotic Autonomy (CLARAty) is a framework that promotes reusable robotic software. It was designed to support heterogeneous robotic platforms and integrate advanced robotic capabilities from multiple institutions. Consequently, its design had to be portable, modular, flexible and extendable.


A lightweight Clifford algebra template library.


A Python-based software component toolkit providing a flexible problem-solving environment for climate science problems. CliMT consists of two layers: a library of climate modeling components (radiative and convective schemes, dynamical cores etc.), mostly in Fortran; and a Python superstructure providing standardized access to each component and allowing coupling of components to form time-dependent models.


Robustly detects extremes against a time-dependent background in climate and weather time series.


A freely available software tool for 3D visualizations and scientific calculations that was conceived and written by Dr. Christian Perwass. CLUCalc interprets a script language called ‘CLUScript’, which has been designed to make mathematical calculations and visualisations very intuitive.


An implementation of the constrained natural element method in 2D and 3D. It is written in C++ and has Python and Matlab wrappers.

Coarray Fortran

An SPMD parallel programming model based on a small set of language extensions to Fortran 90. CAF supports access to non-local data using a natural extension to Fortran 90 syntax, lightweight and flexible synchronization primitives, pointers, and dynamic allocation of shared data. An executing CAF program consists of a static collection of asynchronous process images.

Rice’s implementation of Coarray Fortran 2.0 is a work in progress. We are working to create an open-source, portable, retargetable, high-quality CAF 2.0 compiler suitable for use with production codes. To achieve portability, our compiler performs a source-to-source translation from CAF to Fortran 90 with calls to our CAF 2.0 runtime library primitives. Our CAF compiler’s generated code can be compiled by any Fortran 90 compiler that supports Cray pointers. To achieve high performance, we generate Fortran 90 that is readily optimizable by vendor compilers. Our CAF 2.0 runtime library uses UC Berkeley’s GASNet library as a substrate for communication. GASNet’s get and put operations are used to read and write remote coarray elements. GASNet’s active message support is used to invoke operations on remote nodes. This capability is used to form teams and to look up information about remote coarrays so that process images can read and write them directly.


The Common Data Access toolbox (CODA) provides a set of interfaces for reading remote sensing data from earth observation data files. These interfaces consist of command line applications, libraries, interfaces to scientific applications (such as IDL and MATLAB), and interfaces to programming languages (such as C, Fortran, Python, and Java).

CODA provides a single interface to access data in a wide variety of data formats, including ASCII, binary, XML, netCDF, HDF4, HDF5, GRIB, RINEX, and SP3. This is done by using a generic high level type hierarchy mapping for each data format. For self describing formats such as netCDF, HDF, and GRIB, CODA will automatically construct this mapping based on the file itself. For raw ASCII and binary (and partially also XML) formats CODA makes use of an external format definition stored in .codadef files to determine this mapping. On the download section of this website you will find .codadef files for various earth observation missions that can be used with CODA.


The COllaborative DEvelopment SHell project provides an automatic persistent logbook for sessions of personal command-line work by recording what is being done and how, both for private use/reuse and for sharing selected parts with collaborators.


The primary interface for managing Anaconda installations. It can query and search the Anaconda package index and current Anaconda installation, create new Anaconda environments, and install and update packages into existing Anaconda environments.


A scientific tool for the numerical integration of dynamical systems whose mutual couplings are described by a network. Its name is an abbreviation of “Complex Networks Dynamics”.

Conedy supports different dynamical systems with various integration schemes, including ordinary differential equations, iterated maps, stochastic differential equations, and pulse coupled oscillators which are handled via events. In addition, it provides a simple way to handle arbitrary node dynamics. Each dynamical system is associated with a node in a network and edges between such nodes represent couplings. Conedy provides functions to build a network from various node and edge types.
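The idea of attaching a dynamical system to each node and treating edges as couplings can be illustrated with a minimal sketch in plain Python. This is not Conedy's API: the function names, the choice of Kuramoto phase oscillators, the all-to-all network and the explicit Euler scheme are all assumptions made for illustration.

```python
import math

def kuramoto_step(phases, edges, omega, coupling, dt):
    """Advance every node phase by one explicit Euler step."""
    deriv = list(omega)  # each node starts from its natural frequency
    for src, dst in edges:
        # Directed edge (src, dst): node dst is driven by node src.
        deriv[dst] += coupling * math.sin(phases[src] - phases[dst])
    return [p + dt * d for p, d in zip(phases, deriv)]

def order_parameter(phases):
    """Return r in [0, 1]; r close to 1 means the phases nearly coincide."""
    n = len(phases)
    re = sum(math.cos(p) for p in phases) / n
    im = sum(math.sin(p) for p in phases) / n
    return math.hypot(re, im)

# Hypothetical example: 5 identical oscillators, all-to-all coupling.
n_nodes = 5
edges = [(i, j) for i in range(n_nodes) for j in range(n_nodes) if i != j]
omega = [1.0] * n_nodes
phases = [0.0, 0.4, 0.8, 1.2, 1.6]

r_initial = order_parameter(phases)
for _ in range(2000):
    phases = kuramoto_step(phases, edges, omega, coupling=0.5, dt=0.01)
r_final = order_parameter(phases)
```

Swapping the node dynamics or the edge list changes the model without touching the integration loop, which is the separation the description above refers to.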

Connectivity Modeling System

A community multiscale modeling system, based on a stochastic Lagrangian framework. It was developed to study complex larval migrations and give probability estimates of population connectivity. In addition, the CMS can provide Lagrangian descriptions of oceanic phenomena (advection, dispersion, retention) and can be used in a broad range of applications, from the dispersion and fate of pollutants to marine spatial conservation.


ConTeXt can be used to typeset complex and large collections of documents, like educational materials, user guides and technical manuals. Such documents often have high demands regarding structure, design and accessibility. Ease of maintenance, reuse of content and typographic consistency are important prerequisites. ConTeXt is developed for those who are responsible for producing such documents. ConTeXt is written in the typographical programming language TeX. For using ConTeXt, no TeX programming skills and no technical background are needed. Some basic knowledge of typography and document design will enable you to use the full power of ConTeXt.


A collection of open-source optimization-related Python packages that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.


A data parallel subset of Python which can be dynamically compiled and executed on parallel platforms. Currently, we target NVIDIA GPUs, as well as multicore CPUs through OpenMP and Threading Building Blocks (TBB).


CoreOS is a new Linux distribution that has been rearchitected to provide features needed to run modern infrastructure stacks. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience.


A generic web service and offline processing tool developed within the Centre for Environmental Data Archival (CEDA). The CEDA OGC web services (COWS) is a set of Python libraries that allow rapid development and deployment of geospatial web applications and services built around the standards managed by the Open Geospatial Consortium (OGC). COWS emphasises rapid service development by providing a lightweight layer of OGC web service logic on top of Pylons, a mature web application framework for the Python language. This approach provides developers with a flexible web service development environment without compromising access to the full range of web application tools and patterns: Model-View-Controller paradigm, XML templating, Object-Relational-Mapper integration and authentication/authorisation. COWS contains pre-configured implementations of WMS, WCS and WFS services, a web client and WPS.


A set of libraries providing a comprehensive, efficient and robust software toolkit for creating automated astronomical data-reduction tasks.

python-cpl

Python interface to CPL.

CPython Compiler Tools

Various compiler tools for Python.


The Community Surface Dynamics Modeling System (CSDMS) deals with the Earth’s surface - the ever-changing, dynamic interface between lithosphere, hydrosphere, cryosphere, and atmosphere. We are a diverse community of experts promoting the modeling of earth surface processes by developing, supporting, and disseminating integrated software modules that predict the movement of fluids, and the flux (production, erosion, transport, and deposition) of sediment and solutes in landscapes and their sedimentary basins.


A library for multidimensional numerical integration. The Cuba library offers a choice of four independent routines for multidimensional numerical integration: Vegas, Suave, Divonne, and Cuhre. All four have a C/C++, Fortran, and Mathematica interface and can integrate vector integrands. Their invocation is very similar, so it is easy to substitute one method by another for cross-checking. For further safeguarding, the output is supplemented by a chi-square probability which quantifies the reliability of the error estimate.
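The cross-checking idea can be sketched in plain Python. The example below does not call Cuba and does not implement Vegas, Suave, Divonne or Cuhre; it uses plain Monte Carlo sampling with a standard-error estimate and compares it against a midpoint grid rule. The function names and the test integrand are hypothetical.

```python
import itertools
import math
import random

def monte_carlo(f, dim, n_samples, rng):
    """Plain Monte Carlo estimate over the unit hypercube, with standard error."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        y = f([rng.random() for _ in range(dim)])
        total += y
        total_sq += y * y
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_samples)

def midpoint_grid(f, dim, points_per_axis):
    """Deterministic midpoint rule on a regular grid, used as a cross-check."""
    h = 1.0 / points_per_axis
    mids = [(i + 0.5) * h for i in range(points_per_axis)]
    total = sum(f(x) for x in itertools.product(mids, repeat=dim))
    return total * h ** dim

def integrand(x):
    # Hypothetical test integrand; its exact integral over [0, 1]^3 is 0.75.
    return x[0] * x[1] + x[2]

rng = random.Random(1)
mc_value, mc_error = monte_carlo(integrand, dim=3, n_samples=20000, rng=rng)
grid_value = midpoint_grid(integrand, dim=3, points_per_axis=10)
```

If the two estimates disagree by much more than the reported error, either the integrand is harder than the error estimate suggests or one of the methods is unsuitable for it.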


Light-weight Python framework and OLAP HTTP server for easy development of reporting applications and aggregate browsing of multi-dimensionally modeled data.


CUDA™ is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).


A fast C++/CUDA implementation of convolutional (or more generally, feed-forward) neural networks. It can model arbitrary layer connectivity and network depth. Any directed acyclic graph of layers will do. Training is done using the back-propagation algorithm.


A suite engine and meta-scheduler that specializes in suites of cycling tasks for weather and climate forecasting and related processing (it can also be used for one-off workflows of non-cycling tasks, which is a simpler problem).


A JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.


A software package for numerical simulation of river hydraulics (2D / 1D). It is designed especially for parameter identification, calibration and variational data assimilation. It is interfaced with a few pre- and post-processors.


A lightweight data management application developed in Python that primarily targets the management of huge data accumulations, often encountered in the scientific field. The system is able to handle large amounts of data and can be easily integrated in existing working environments. It can be optimised to fit any situation by embedding scripts.


A robust real-time streaming data engine that lets you quickly stream live data from experiments, labs, web cams and even Java enabled cell phones. It acts as a "black box" to which applications and devices send and receive data. Think of it as express delivery for your data, be it numbers, video, sound or text. DataTurbine is a buffered middleware, not simply a publish/subscribe system. It can receive data from various sources (experiments, web cams, etc) and send data to various sinks (visualization interfaces, analysis tools, databases, etc). It has "TiVO" like functionality that lets applications pause and rewind live streaming data.
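The buffered-middleware behaviour (sources write, sinks read, and recent history can be replayed) can be sketched with a fixed-size ring buffer. This is not the DataTurbine API; the class and method names below are hypothetical.

```python
from collections import deque

class RingBufferChannel:
    """Keeps only the most recent `capacity` frames of a data stream."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest frame when full.
        self._frames = deque(maxlen=capacity)

    def put(self, timestamp, value):
        """Called by a source to append one frame."""
        self._frames.append((timestamp, value))

    def newest(self):
        """Called by a live sink to read the latest frame."""
        return self._frames[-1]

    def rewind(self, start_time, end_time):
        """Replay buffered frames whose timestamps fall in [start_time, end_time]."""
        return [frame for frame in self._frames
                if start_time <= frame[0] <= end_time]

channel = RingBufferChannel(capacity=5)
for t in range(10):
    channel.put(t, t * 10)

latest = channel.newest()
replay = channel.rewind(6, 8)
expired = channel.rewind(0, 3)  # these frames have already left the buffer
```

The buffer capacity sets how far back a sink can pause or rewind, which is the trade-off a buffered design adds on top of plain publish/subscribe.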


A crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data.


A novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP.


A pseudospectral solver for fluid equations. Its primary applications are in Astrophysics and Cosmology. Written primarily in python, and making use of the FFTW libraries, Dedalus aims to be a simple, fast, and elegant hydrodynamic and magnetohydrodynamic code.


An open source software for spatial data infrastructures and the geospatial web. deegree includes components for geospatial data management, including data access, visualization, discovery and security. Open standards are at the heart of deegree. The software is built on the standards of the Open Geospatial Consortium (OGC) and the ISO Technical Committee 211. It includes the OGC Web Map Service (WMS) reference implementation, a fully compliant Web Feature Service (WFS) as well as packages for Catalogue Service (CSW), Web Coverage Service (WCS), Web Processing Service (WPS) and Web Map Tile Service (WMTS).


A modeling suite to investigate hydrodynamics, sediment transport and morphology and water quality for fluvial, estuarine and coastal environments. The FLOW module is the heart of Delft3D and is a multi-dimensional (2D or 3D) hydrodynamic (and transport) simulation programme which calculates non-steady flow and transport phenomena resulting from tidal and meteorological forcing on a curvilinear, boundary fitted grid or spherical coordinates. In 3D simulations, the vertical grid is defined following the so-called sigma coordinate approach or Z-layer approach. The MOR module computes sediment transport (both suspended and bed total load) and morphological changes for an arbitrary number of cohesive and non-cohesive fractions. Both currents and waves act as driving forces and a wide variety of transport formulae have been incorporated. For the suspended load this module connects to the 2D or 3D advection-diffusion solver of the FLOW module; density effects may be taken into account. An essential feature of the MOR module is the dynamic feedback with the FLOW and WAVE modules, which allows the flows and waves to adjust themselves to the local bathymetry and allows for simulations on any time scale from days (storm impact) to centuries (system dynamics). It can keep track of the bed composition to build up a stratigraphic record. The MOR module may be extended to include extensive features to simulate dredging and dumping scenarios.


Dexy was created out of a desire to unify software documentation and scientific document automation, resulting in a tool that is better at both of these than anything that has gone before.


A lightweight job execution control framework for parallel scientific applications. DIANE improves the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery. DIANE provides an environment in which existing applications may be more easily ported to heterogeneous computing environments such as the Grid, batch farms or interactive clusters. The default scheduling plugin algorithms are suited for bag-of-tasks applications and data-parallel problems with no inter-task communication. However, the framework is designed to make it easy to plug in other scheduling algorithms for more complex task synchronization patterns and workflows. For example, the DAG4DIANE plugin provides support for directed acyclic graph (DAG) applications and the MOTEUR plugin provides support for workflow applications.


An EOF-based method to fill in missing data from geophysical fields, such as clouds in sea surface temperature.


A C++ library for computing persistent homology.


A lightweight, open-source framework for distributed computing based on the MapReduce paradigm.
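The MapReduce paradigm itself can be sketched in a few lines of single-process Python. This does not use the framework's API and involves no distribution; the function names are hypothetical and the three phases (map, shuffle, reduce) simply run in sequence.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run map, shuffle and reduce phases in a single process."""
    # Map: each record yields zero or more (key, value) pairs.
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    # Shuffle: group all values that share a key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce: collapse each group of values to a single result.
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    return [(word.lower(), 1) for word in line.split()]

def count_reducer(word, counts):
    return sum(counts)

lines = ["the quick brown fox", "the lazy dog", "The fox"]
word_counts = map_reduce(lines, word_mapper, count_reducer)
```

In a distributed setting the map calls and the per-key reduce calls are independent of each other, which is what lets a framework spread them across many machines.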


A version of NumPy that parallelizes array operations in a manner completely transparent to the user.


The Distributed and Unified Numerics Environment is a modular toolbox for solving partial differential equations (PDEs) with grid-based methods. It supports the easy implementation of methods like Finite Elements (FE), Finite Volumes (FV), and also Finite Differences (FD).


A project to develop a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model.


An open-architecture, open-source public software package for data acquisition, processing, archival and distribution. Originally developed by the United States Geological Survey, Earthworm binaries and source files are freely available to everyone.


Python wrapper for accessing an Earthworm shared memory ring.


A visual analytics tool for exploring multivariate data sets. EDEN helps you see the associations among variables for guided analysis. EDEN harnesses the parallel coordinates visualization technique and is augmented with graphical indicators of key descriptive statistics.


A C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.


Elixir is a functional, meta-programming aware language built on top of the Erlang VM. It is a dynamic language with flexible syntax and macro support that leverages Erlang’s abilities to build concurrent, distributed and fault-tolerant applications with hot code upgrades.

Ellipsoidal Potential Theory

Open-source (BSD) implementations of ellipsoidal harmonic expansions for solving problems of potential theory using separation of variables.


An open source multiphysical simulation software mainly developed by CSC - IT Center for Science (CSC). Elmer development was started in 1995 in collaboration with Finnish universities, research institutes and industry. After its open source publication in 2005, the use and development of Elmer has become international.

Elmer includes physical models of fluid dynamics, structural mechanics, electromagnetics, heat transfer and acoustics, for example. These are described by partial differential equations which Elmer solves by the Finite Element Method (FEM).


An extensible, pure-Python implementation of Goodman & Weare’s Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler. It’s designed for Bayesian parameter estimation.
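The core of the affine-invariant ensemble method is the "stretch move", which proposes a new position for one walker along the line through another walker. The sketch below is a standard-library illustration of that move, not the package's own code or API: walkers are updated serially rather than in split halves, and the target density, walker count and function names are assumptions chosen for the example.

```python
import math
import random

def stretch_sweep(walkers, log_prob, rng, a=2.0):
    """Update every walker once, in place, using the stretch move."""
    n_walkers = len(walkers)
    dim = len(walkers[0])
    for k in range(n_walkers):
        # Pick a partner walker j != k uniformly at random.
        j = rng.randrange(n_walkers - 1)
        if j >= k:
            j += 1
        # Draw z from a density proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        proposal = [xj + z * (xk - xj)
                    for xk, xj in zip(walkers[k], walkers[j])]
        log_ratio = ((dim - 1) * math.log(z)
                     + log_prob(proposal) - log_prob(walkers[k]))
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            walkers[k] = proposal

def log_gaussian(x):
    # Hypothetical target: unit-variance Gaussian centred at (1, -2).
    return -0.5 * ((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)

rng = random.Random(42)
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
samples = []
for sweep in range(2000):
    stretch_sweep(walkers, log_gaussian, rng)
    if sweep >= 500:  # discard the first sweeps as burn-in
        samples.extend(list(w) for w in walkers)

mean_x = sum(s[0] for s in samples) / len(samples)
mean_y = sum(s[1] for s in samples) / len(samples)
```

Because each proposal is built from the positions of other walkers, the move needs no hand-tuned step size per parameter; the only tuning constant is the stretch scale `a`.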


Empirical Mode Decomposition is an algorithm that finds common rotational modes among all the channels of n-channel data, and is a generic multidimensional extension of the standard EMD.


This open source, digitizing software converts an image file showing a graph or map, into numbers. The image file can come from a scanner, digital camera or screenshot. The numbers can be read on the screen, and written or copied to a spreadsheet.

The process starts with an image file containing a graph or map. The final result is digitized data that can be used by other tools such as Microsoft Excel and Gnumeric.


The EnKF is a sophisticated sequential data assimilation method. It applies an ensemble of model states to represent the error statistics of the model estimate, applies ensemble integrations to predict the error statistics forward in time, and uses an analysis scheme which operates directly on the ensemble of model states when observations are assimilated. The EnKF has proven to efficiently handle strongly nonlinear dynamics and large state spaces and is now used in realistic applications with primitive equation models for the ocean and atmosphere.
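The analysis step can be made concrete with a deliberately tiny sketch: a scalar state, a single direct observation, and the perturbed-observation (stochastic) variant of the filter. The function name, the ensemble size and the numbers are assumptions for illustration; the forecast step, in which each member is integrated forward by the model, is omitted.

```python
import random
import statistics

def enkf_analysis(ensemble, observation, obs_std, rng):
    """Update a scalar-state ensemble with one direct observation (H = 1)."""
    # The ensemble spread stands in for the forecast error variance.
    forecast_var = statistics.variance(ensemble)
    gain = forecast_var / (forecast_var + obs_std ** 2)  # Kalman gain
    updated = []
    for member in ensemble:
        # Each member assimilates its own noisy copy of the observation.
        perturbed_obs = observation + rng.gauss(0.0, obs_std)
        updated.append(member + gain * (perturbed_obs - member))
    return updated

rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(2000)]  # forecast ensemble
posterior = enkf_analysis(prior, observation=3.0, obs_std=1.0, rng=rng)

prior_mean = statistics.fmean(prior)
posterior_mean = statistics.fmean(posterior)
posterior_var = statistics.variance(posterior)
```

With a prior variance near 4 and an observation variance of 1, the linear Kalman update would give a posterior mean near 2.4 and a variance near 0.8; the ensemble reproduces these up to sampling error without ever forming a covariance matrix explicitly.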

Enthought Tool Suite

A suite of Python tools for constructing custom scientific applications.


A Python plotting application toolkit that facilitates writing plotting applications at all levels of complexity, from simple scripts with hard-coded data to large plotting programs with complex data interrelationships and a multitude of interactive tools. While Chaco generates attractive static plots for publication and presentation, it also works well for interactive data visualization and exploration.


Enaml is Not A Markup Language. Enaml is a library for creating professional quality user interfaces with minimal effort. Enaml combines a domain specific declarative language with a constraints based layout system to allow users to easily define rich UIs with complex and flexible layouts. Enaml applications can transparently run on multiple backends (Qt and Wx) and on multiple operating systems.


A Python package for 3D scientific visualization. The project includes Mayavi, a tool for easy, interactive visualization of data that’s integrated with Python scientific libraries, and TVTK, a Traits-based wrapper for VTK.


A trait is a type definition that can be used for normal Python object attributes, giving the attributes some additional characteristics such as initialization, validation, delegation, notification and visualization. The Traits package was developed to address some of the problems caused by not having declared variable types, in those cases where problems might arise.
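Three of these characteristics (initialization, validation and notification) can be imitated with an ordinary Python descriptor. The sketch below is not the Traits API; `TypedAttribute`, `Observable`, `observe` and `Sensor` are hypothetical names used only to show the mechanism.

```python
class TypedAttribute:
    """Descriptor giving an attribute a default, a type check and change events."""

    def __init__(self, expected_type, default):
        self.expected_type = expected_type
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name
        self.storage = "_" + name  # per-instance slot holding the value

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage, self.default)  # initialization

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):  # validation
            raise TypeError(
                f"{self.name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}")
        old = self.__get__(obj)
        setattr(obj, self.storage, value)
        # Notification: call every listener registered for this attribute.
        for callback in getattr(obj, "_observers", {}).get(self.name, []):
            callback(self.name, old, value)

class Observable:
    def observe(self, name, callback):
        """Register callback(name, old, new) for changes to one attribute."""
        self.__dict__.setdefault("_observers", {}).setdefault(name, []).append(callback)

class Sensor(Observable):
    gain = TypedAttribute(float, 1.0)
    label = TypedAttribute(str, "")

events = []
sensor = Sensor()
sensor.observe("gain", lambda name, old, new: events.append((name, old, new)))
sensor.gain = 2.5

try:
    sensor.gain = "high"  # rejected: not a float
    rejected = False
except TypeError:
    rejected = True
```

The failed assignment leaves `sensor.gain` at 2.5 and fires no event, which is the kind of early error a declared attribute type is meant to provide.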

Envisat CFI

A collection of precompiled C libraries for timing, coordinate conversions, orbit propagation, satellite pointing calculations, and target visibility calculations. This software is made available by the Envisat project to any user involved in the Envisat mission preparation/exploitation.


The Earth Observation CFI software is a collection of precompiled C libraries for timing, coordinate conversions, orbit propagation, satellite pointing calculations, and target visibility calculations. This software is made available by the EOP system support division to any user involved in the Earth Observation missions preparation/exploitation. As of version 4.0, the Earth Observation CFI Software is available both as C and C++ precompiled libraries and Java libraries.


A server for earth observation data. EOxServer implements the OGC Implementation Specifications EO-WCS and EO-WMS on top of MapServer’s WCS and WMS implementations. EOxServer is released under the EOxServer Open License, an MIT-style license. It is written in Python and is entirely based on open source software including MapServer, Django, GDAL, SpatiaLite or PostGIS, and PROJ.4. The functionality includes:

  • Support of GML AP – Coverages for RectifiedGridCoverages

  • Support of adopted WCS 2.0 specification (Core including GetCapabilities, DescribeCoverage, and GetCoverage requests, KVP-, and XML/POST protocol binding)

  • Anticipated support of envisaged extensions: Coverage format, GeoTIFF encoding, predefined (or EPSG) CRSs, scaling & interpolation, and non-referenced access. By "anticipating" we mean to reflect the latest WCS.SWG discussions as well as to follow the relevant parts of the previous 1.1 and 1.0 versions of WCS.

  • Support of 2-D EO Coverages derived from gmlcov:RectifiedGridCoverage

  • Support of 2-D EO Coverages derived from gmlcov:ReferenceableGridCoverage

  • Support of Dataset Series as a collection of EO Coverages e.g. in a time series

  • Support of new DescribeEOCoverageSet operation on Dataset Series and EO Coverages

  • Support of Stitched Mosaic of Rectified EO Coverages including concept of contributingFootprint

  • Support of EO Metadata (retrieval and evaluation in DescribeEOCoverageSet operation)

  • Support of KVP and XML/POST protocol bindings

  • Support of GeoTIFF and GDAL library coverage formats

  • Support of EO-WMS for EO coverages


A programming tool for implementing mathematical models in Python using the finite element method (FEM). As users do not access the data structures directly, it is very easy to use, and scripts can run on desktop computers as well as highly parallel supercomputers without changes. Application areas for escript include earth mantle convection, geophysical inversion, earthquakes, porous media flow, reactive transport, plate subduction, erosion, and tsunamis.

Escript is designed as an easy-to-use environment for implementing mathematical models based on non-linear, coupled, time-dependent partial differential equations. It uses the finite element method (FEM) for spatial discretization and data representation. Escript is used through Python and is suitable for rapid prototyping (e.g. for a student project or thesis) as well as for large software projects. Scripts are executed in parallel using MPI, OpenMP and hybrid mode, processing over 50 million unknowns on several thousand cores on a parallel computer.


The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) enterprise system is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data. ESGF’s primary goal is to facilitate advancements in Earth System Science. ESGF P2P is a component architecture expressly designed to handle large-scale data management for worldwide distribution. The team of computer scientists and climate scientists has developed an operational system for serving climate data from multiple locations and sources. Model simulations, satellite observations, and reanalysis products are all being served from the ESGF P2P distributed data archive.


A tool for publishing scientific datasets (climate data in particular) to


Empirical gramians can be computed for linear and nonlinear control systems for purposes of model order reduction (MOR), uncertainty quantification (UQ) or system identification (SYSID). Model reduction using empirical gramians can be applied to the state space, to the parameter space or to both through combined reduction. For state reduction the empirical controllability gramian and the empirical observability gramian, for balanced truncation, are available, or alternatively the empirical cross gramian for direct truncation. For parameter reduction, parameter identification and sensitivity analysis the empirical sensitivity gramian (controllability of parameters) or the empirical identifiability gramian (observability of parameters) are provided. Combined state and parameter reduction is enabled by the empirical joint gramian, which computes controllability and observability of states and parameters concurrently. The emgr framework is a compact open source toolbox for (empirical) gramian-based model reduction and is compatible with OCTAVE and MATLAB.



The Earth System Modeling Framework (ESMF) collaboration is building high-performance, flexible software infrastructure to increase ease of use, performance portability, interoperability, and reuse in climate, numerical weather prediction, data assimilation, and other Earth science applications. The ESMF defines an architecture for composing complex, coupled modeling systems and includes data structures and utilities for developing individual models.


The Common Information Model (CIM) is a metadata standard used by the climate research community and others to describe the artifacts and processes they work with. This includes climate simulations, the specific model components used to run those simulations, the datasets generated by those components, the geographic grids upon which those components and data are mapped, the computing platforms used, and so on.


Earth System CoG is a web environment that enables users to create project workspaces, connect projects into networks, share and consolidate information within those networks, and seamlessly link to tools for data archival, reformatting and search, data visualization, and metadata collection and display. CoG is integrated with the Earth System Grid Federation (ESGF) data distribution software and provides an easy to use interface to its services.


ES-DOC is an international effort to develop tools to describe Earth system models in order to better understand and utilize model data. The tools are based on the Common Information Model (CIM) standard.


The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) enterprise system is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data.

ESMF Web Services

The option to implement a variety of models as web services was implemented in the Earth System Modeling Framework (ESMF). ESMF is based on the idea of components, which may represent physical domains such as the atmosphere, ocean, or cryosphere, or specific processes such as ocean biogeochemistry. These components have a standard interface that includes a specification of input fields, output fields, and time information. When running on high performance computing systems, ESMF components are usually called as subroutines of a main program. With ESMF web services, the components can be run on multiple computer systems, and can communicate with each other through web protocols.

ESMF web services are currently comprised of a set of SOAP (Simple Object Access Protocol) interfaces implemented using a combination of Apache Tomcat, Axis2, and custom Java classes. The SOAP services provide the gateway between the ESMF components and the Internet.


ESMPy is a Python interface to the Earth System Modeling Framework (ESMF) regridding utility.

ESMF is software for building and coupling weather, climate, and related models. It has a robust, parallel and scalable remapping package, used to generate remapping weights. It can handle a wide variety of grids and options: logically rectangular grids and unstructured meshes; regional or global grids; 2D or 3D; and pole and masking options. ESMF also has capabilities to read grid information from NetCDF files in a variety of formats, including the evolving Climate and Forecast (CF) GridSpec and UGRID conventions. It is currently being merged with the OpenClimateGIS package so that it can also support Geographic Information System (GIS) data formats.

ESMPy supports a single-tile logically rectangular discretization type called Grid and an unstructured discretization type called Mesh (ESMF also supports observational data streams). ESMPy supports bilinear, finite element patch recovery and first-order conservative interpolation methods. There is also an option to ignore unmapped destination points and mask out points on either the source or destination. Regridding on the sphere takes place in 3D Cartesian space, so the pole problem is not an issue as it can be with other Earth system grid remapping software. Grid and Mesh objects can be created in 2D or 3D space, and 3D first-order conservative regridding is fully supported. Future plans for ESMPy involve the incorporation of observational data streams and time operations, in addition to the GIS formats mentioned previously.
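Why working in 3D Cartesian space sidesteps the pole problem can be seen in a small standard-library sketch. It does not use ESMPy or reproduce ESMF's regridding algorithms; the helper names are hypothetical, and the example only compares a midpoint computed in longitude/latitude with one computed from unit vectors.

```python
import math

def to_cartesian(lon_deg, lat_deg):
    """Map a longitude/latitude pair to a point on the unit sphere."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_lonlat(x, y, z):
    """Project a non-zero 3D vector back to longitude/latitude in degrees."""
    norm = math.sqrt(x * x + y * y + z * z)
    sin_lat = max(-1.0, min(1.0, z / norm))  # guard against rounding
    return math.degrees(math.atan2(y, x)), math.degrees(math.asin(sin_lat))

def cartesian_midpoint(p, q):
    """Midpoint of two lon/lat points, computed through 3D space."""
    a, b = to_cartesian(*p), to_cartesian(*q)
    return to_lonlat(*[(u + v) / 2.0 for u, v in zip(a, b)])

def naive_midpoint(p, q):
    """Midpoint computed directly in lon/lat, which breaks down near a pole."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

# Two points on opposite sides of the north pole, one degree away from it.
p, q = (0.0, 89.0), (180.0, 89.0)
naive = naive_midpoint(p, q)          # stays at latitude 89
through_3d = cartesian_midpoint(p, q)  # lands on the pole itself

# Away from the poles both views agree on simple cases.
equator_mid = cartesian_midpoint((0.0, 0.0), (90.0, 0.0))
```

The longitude returned at the pole is arbitrary, as it must be; only the latitude is meaningful there.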


The National Unified Operational Prediction Capability (NUOPC) is a consortium of Navy, NOAA, and Air Force modelers and their research partners. It aims to advance the weather prediction modeling systems used by meteorologists, mission planners, and decision makers. NUOPC partners are working toward a common model architecture - a standard way of building models - in order to make it easier to collaboratively build modeling systems. To this end, they have developed a NUOPC Layer that defines conventions and templates for using the Earth System Modeling Framework (ESMF).


OpenClimateGIS (OCGIS) is a Python package designed for geospatial manipulation, subsetting, computation, and translation of climate datasets stored in local NetCDF files or files served through THREDDS data servers. OpenClimateGIS has a straightforward, request-based API that is simple to use yet complex enough to perform a variety of computational tasks. The software is built entirely from open source packages. ClimateTranslator is a new web interface to the OpenClimateGIS functionality. OpenClimateGIS is currently being merged with high performance parallel grid remapping capabilities from ESMF, through the ESMPy package.


ESMF Python interface.


A computer language devoted to elementary plane geometry. It aims to be a fairly comprehensive system to create geometric figures, either static or dynamic. Eukleides handles basic types of data (numbers and strings) as well as geometric types of data: points, vectors, sets (of points), lines, circles and conics.

A Eukleides script usually consists of a declarative part where objects are defined, and a descriptive part where objects are drawn. Nevertheless, Eukleides is also a full-featured programming language, providing conditional and iterative structures, user-defined functions, modules, etc. Hence, it can easily be extended.

The Eukleides distribution mainly provides two interpreters: eukleides and euktopst. The former produces Encapsulated PostScript (EPS) files. It can also, using a converter, yield animated GIFs. The latter produces PSTricks macros, which makes it possible to include Eukleides figures in LaTeX documents.


EULAG is a numerical solver for all-scale geophysical flows. The underlying anelastic equations are either solved in an EULerian (flux form), or a LAGrangian (advective form) framework.

The EULAG model is an ideal tool to perform numerical experiments in a virtual laboratory with time-dependent adaptive meshes and within complex, and even time-dependent, model geometries. These abilities are due to the unique model design that combines the nonoscillatory forward-in-time (NFT) numerical algorithms and a robust elliptic solver with generalized coordinates. The code is written as a research tool with numerous options controlling the numerical accuracy and allowing for a wide range of numerical sensitivity tests. These capabilities give the researcher confidence in the numerical solutions of his/her problem. The formulation of the model equations allows for various derivatives of the code including codes for stellar atmospheres, ocean currents, sand dune propagation or biomechanical flows. EULAG is a fully parallelized code and is easily portable between different platforms.


A program for quickly and interactively computing with real and complex numbers and matrices, or with intervals, in the style of MATLAB or Octave. It can draw and animate your functions in two and three dimensions.


A software tool for detecting equations and hidden mathematical relationships in your data. Its goal is to identify the simplest mathematical formulas which could describe the underlying mechanisms that produced the data.




An extension module for Python that implements an optimized, register-machine-based interpreter inside your interpreter. You specify which functions you want Falcon to wrap (or your entire module), and Falcon takes over execution from there.


Falkon aims to enable the rapid and efficient execution of many tasks on large compute clusters, and to improve application performance and scalability using novel data management techniques.


Fast Artificial Neural Network Library is a free, open source neural network library that implements multilayer artificial neural networks in C, with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast. Bindings to more than 15 programming languages are available.


A data processing library that offers a set of searching functions supported by compressed bitmap indexes. The key technology underlying the FastBit software is a set of compressed bitmap indexes. In database systems, an index is a data structure to accelerate data accesses and reduce the query response time. Most of the commonly used indexes are variants of the B-tree, such as B+-tree and B*-tree. FastBit implements a set of alternative indexes called compressed bitmap indexes. Compared with B-tree variants, these indexes provide very efficient searching and retrieval operations, but are somewhat slower to update after a modification of an individual record.
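
The idea behind a bitmap index can be shown in a few lines. The sketch below is not FastBit's API or its compression scheme: it is an uncompressed, equality-encoded index that uses plain Python integers as bit vectors, and the class name `BitmapIndex`, the helper `rows` and the two sample columns are all hypothetical. It illustrates why multi-column queries are cheap (they reduce to bitwise AND/OR) and why updates cost more (changing one record means touching a bit in several bitmaps).

```python
from collections import defaultdict


class BitmapIndex:
    """Uncompressed, equality-encoded bitmap index over one column."""

    def __init__(self, values):
        # One bitmap per distinct value; bit `row` is set if that row holds the value.
        self.bitmaps = defaultdict(int)
        for row, value in enumerate(values):
            self.bitmaps[value] |= 1 << row

    def equals(self, value):
        """Bitmap of rows whose column equals `value`."""
        return self.bitmaps.get(value, 0)

    def in_range(self, low, high):
        """Bitmap of rows with low <= value <= high (OR of the matching bitmaps)."""
        result = 0
        for value, bitmap in self.bitmaps.items():
            if low <= value <= high:
                result |= bitmap
        return result


def rows(bitmap):
    """Decode a bitmap into a sorted list of row numbers."""
    out, row = [], 0
    while bitmap:
        if bitmap & 1:
            out.append(row)
        bitmap >>= 1
        row += 1
    return out


# Hypothetical table with two columns.
temperature = [12, 30, 25, 30, 8, 25]
station = ["A", "B", "A", "A", "B", "B"]

temp_index = BitmapIndex(temperature)
station_index = BitmapIndex(station)

# WHERE temperature BETWEEN 20 AND 30 AND station = 'A'
hits = temp_index.in_range(20, 30) & station_index.equals("A")
matching_rows = rows(hits)
```

Here `matching_rows` holds the rows that satisfy both conditions; the query never scans the columns themselves, only the precomputed bitmaps.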


Python bindings for FastBit.


A C++ library for hierarchical, agglomerative clustering. It provides a fast implementation of the most efficient, current algorithms when the input is a dissimilarity index. Moreover, it features memory-saving routines for hierarchical clustering of vector data. It improves both asymptotic time complexity (in most cases) and practical performance (in all cases) compared to the existing implementations in standard software: several R packages, MATLAB, Mathematica, Python with SciPy.
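
For orientation, this is what agglomerative clustering on a dissimilarity matrix computes. The sketch below is the naive cubic-time baseline with single linkage, written in pure Python; it is not one of the algorithms the library implements, and the function name `single_linkage` and the sample points are hypothetical.

```python
def single_linkage(dist):
    """Naive agglomerative clustering with single linkage.

    `dist` is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a frozenset of original point indices.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical one-dimensional points; dissimilarity is the absolute difference.
points = [0.0, 1.0, 5.0, 6.5, 20.0]
dist = [[abs(p - q) for q in points] for p in points]
merges = single_linkage(dist)
merge_heights = [d for _, _, d in merges]
```

Every pass rescans all cluster pairs, which is exactly the cost that specialised implementations avoid.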


A Matlab program that implements the fast fixed-point algorithm for independent component analysis and projection pursuit. It features an easy-to-use graphical user interface, and a computationally powerful algorithm.

The FastICA algorithm is also available in the MDP and scikit-learn packages.

Fatiando a Terra

A Python toolkit for geophysical modeling and inversion.


FCM uses Subversion for code management but defines a common process and naming convention to simplify usage. It adds a layer on top of Subversion to provide a more natural and user-friendly interface. FCM features a powerful build system, mainly aimed at building modern Fortran software applications.


An API for manipulating, defining and analyzing geospatial information regardless of where it is stored. FDO uses a provider-based model for supporting a variety of geospatial data sources, where each provider typically supports a particular data format or data store.


A free high-performance numerical library for solving the standard or generalized eigenvalue problem.


A collection of free software with an extensive list of features for automated, efficient solution of differential equations.


A geometric multigrid solver for FEniCS.


Developing a new probabilistic model requires developing a representation for the model and a reasoning algorithm that can draw useful conclusions from evidence, which can be challenging tasks. Furthermore, it can be difficult to integrate a probabilistic model into a larger program.

Figaro is a probabilistic programming language that helps address both these issues. Figaro makes it possible to express probabilistic models using the power of programming languages, giving the modeler the expressive tools to create all sorts of models. Figaro comes with a number of built-in reasoning algorithms that can be applied automatically to new models. In addition, Figaro models are data structures in the Scala programming language, which is interoperable with Java, and can be constructed, manipulated, and used directly within any Scala or Java program.
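
The two ingredients named above, a model held as an ordinary data structure and a generic reasoning algorithm applied to it, can be illustrated without Figaro itself. The sketch below is plain Python rather than Figaro's Scala API: the tables `p_sunny` and `p_greeting`, their probabilities, and the functions `joint` and `posterior` are all hypothetical, and inference is done by brute-force enumeration.

```python
from fractions import Fraction as F

# The model is plain data: P(sunny) and P(greeting | sunny).
p_sunny = {True: F(1, 5), False: F(4, 5)}
p_greeting = {
    True: {"Hello, world!": F(3, 5), "Howdy, universe!": F(2, 5)},
    False: {"Hello, world!": F(1, 5), "Oh no, not again": F(4, 5)},
}


def joint():
    """Enumerate every possible world together with its probability."""
    for sunny, ps in p_sunny.items():
        for greeting, pg in p_greeting[sunny].items():
            yield {"sunny": sunny, "greeting": greeting}, ps * pg


def posterior(query, evidence):
    """P(query | evidence) by exact enumeration over all worlds."""
    matching = sum(p for world, p in joint() if evidence(world) and query(world))
    observed = sum(p for world, p in joint() if evidence(world))
    return matching / observed


# Having observed the greeting, how likely is it that the day is sunny?
p_sunny_given_hello = posterior(
    query=lambda w: w["sunny"],
    evidence=lambda w: w["greeting"] == "Hello, world!",
)
```

Because `posterior` only needs the enumeration of worlds, the same function works unchanged if the tables are replaced by a different model.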


The File Interpolation, Manipulation and EXtraction library for gridded geospatial data, written in C/C++. It converts between different, extensible data formats (currently NetCDF, NcML, GRIB1/2 and felt). It enables you to change the projection and interpolation of scalar and vector grids. It also makes it possible to subset the gridded data and to extract only parts of the files.


Python interfaces to functions in OGR, a library for reading and writing geographic vector data.

See also keytree, Shapely and Rtree.


The objective of the FLAME project is to transform the development of dense linear algebra libraries from an art reserved for experts to a science that can be understood by novice and expert alike. Rather than being only a library, the project encompasses a new notation for expressing algorithms, a methodology for systematic derivation of algorithms, Application Program Interfaces (APIs) for representing the algorithms in code, and tools for mechanical derivation, implementation and analysis of algorithms and implementations.


A software framework for instantiating high-performance BLAS-like dense linear algebra libraries.


A high performance dense linear algebra library that is the result of the FLAME methodology for systematically developing dense linear algebra libraries.


Extends C++ with matrix/vector types ideally suited for numerical linear algebra.


A large-scale, open source fluid simulator for the CPU and GPU using the smoothed-particle hydrodynamics method. Fluids is capable of efficiently simulating up to 8 million particles on the GPU (on 1500 MB of RAM).


A software framework for supporting the efficient development, construction, execution, and scientific interpretation of atmospheric, oceanic, and climate system models.


An OpenMP runtime compatible with GCC 4.2, offering a structured way to efficiently execute OpenMP applications on hierarchical (NUMA) architectures.


Marcel is a thread library that was originally developed to meet the needs of the PM2 multithreaded environment. Marcel provides a POSIX-compliant interface and a set of original extensions. It can also be compiled to provide ABI compatibility with NPTL threads under Linux, so that multithreaded applications can use Marcel without being recompiled. Marcel features a two-level thread scheduler (also called an N:M scheduler) that achieves the performance of a user-level thread package while being able to exploit multiprocessor machines. The architecture of Marcel was carefully designed to support a high number of threads and to efficiently exploit hierarchical architectures (e.g. multi-core chips, NUMA machines).


Elemental is open-source software for distributed-memory dense linear algebra.


The Family of Simplified Solver Interfaces is designed for easy integration and selection of parallel solvers in Fortran codes that use the compressed sparse row (CSR) matrix format. FoSSI contains rather similar interfaces to the most popular and widespread parallel solver libraries obtainable on the web: PETSC, HYPRE, AZTEC and MUMPS. Furthermore, an interface to the PILUT library is included together with the PILUT solver itself.


The Shallow Water equations for Overland Flow solves the shallow water equations using finite volumes.
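
As a minimal illustration of a finite-volume treatment of the shallow water equations, the sketch below advances the one-dimensional system for depth h and discharge q = hu on a flat, frictionless bed. It makes no claim about the schemes the package itself uses: the first-order Rusanov (local Lax-Friedrichs) flux, the reflective walls, the dam-break setup and all function names are assumptions chosen for brevity.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def physical_flux(h, q):
    """Flux of the 1D shallow water equations for depth h and discharge q = h*u."""
    return q, q * q / h + 0.5 * G * h * h


def wave_speed(h, q):
    """Largest characteristic speed |u| + sqrt(g*h) in a cell."""
    return abs(q / h) + math.sqrt(G * h)


def rusanov_flux(left, right):
    """Rusanov (local Lax-Friedrichs) numerical flux across one interface."""
    a = max(wave_speed(*left), wave_speed(*right))
    f_left, f_right = physical_flux(*left), physical_flux(*right)
    return tuple(
        0.5 * (fl + fr) - 0.5 * a * (ur - ul)
        for fl, fr, ul, ur in zip(f_left, f_right, left, right)
    )


def step(cells, dx, cfl=0.4):
    """Advance all cells by one explicit time step; returns (new_cells, dt)."""
    dt = cfl * dx / max(wave_speed(h, q) for h, q in cells)
    # Reflective walls: ghost cells mirror the depth and negate the discharge.
    first, last = cells[0], cells[-1]
    padded = [(first[0], -first[1])] + cells + [(last[0], -last[1])]
    fluxes = [rusanov_flux(padded[i], padded[i + 1]) for i in range(len(padded) - 1)]
    new_cells = []
    for i, (h, q) in enumerate(cells):
        (fh_in, fq_in), (fh_out, fq_out) = fluxes[i], fluxes[i + 1]
        new_cells.append(
            (h - dt / dx * (fh_out - fh_in), q - dt / dx * (fq_out - fq_in))
        )
    return new_cells, dt


# Dam break on a flat bed: 2 m of still water on the left, 1 m on the right.
dx = 0.1
cells = [(2.0, 0.0)] * 50 + [(1.0, 0.0)] * 50
initial_mass = sum(h for h, _ in cells) * dx
for _ in range(50):
    cells, dt = step(cells, dx)
final_mass = sum(h for h, _ in cells) * dx
```

Because each interface flux is subtracted from one cell and added to its neighbour, and the mass flux through the walls is zero, the total water volume is conserved up to rounding error.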


A phase-resolving, time-stepping Boussinesq model for ocean surface wave propagation in the nearshore. The present version of FUNWAVE is based on the MUSCL-TVD finite volume scheme together with adaptive Runge-Kutta time stepping. The code is parallelized using MPI and has been tested in Linux and Unix (Mac OS X) environments.


Enables simulation of the Boussinesq or shallow water equations. CaFunwave is based on FUNWAVE.


A prognostic, unstructured-grid, finite-volume, free-surface, 3-D primitive equation coastal ocean circulation model developed by UMASSD-WHOI joint efforts. The model consists of momentum, continuity, temperature, salinity and density equations and is closed physically and mathematically using turbulence closure submodels. The horizontal grid is comprised of unstructured triangular cells and the irregular bottom is represented using generalized terrain-following coordinates. The General Ocean Turbulent Model (GOTM) developed by Burchard’s research group in Germany (Burchard, 2002) has been added to FVCOM to provide optional vertical turbulent closure schemes.


The Geometric Algebra Algorithms Expression Templates library is a C++ library for evaluating geometric algebra expressions. It offers comfortable implementation and reasonable speed by using expression templates and metaprogramming techniques.

The basic idea of fast Geometric Algebra implementations is to do the grading operations beforehand, so that only basic operations on the coordinates are performed at runtime. Gaalet does so by applying the grading operations with C++ metaprogramming techniques at compile time. These grading operations are incorporated into expression templates, also a metaprogramming technique, which gives C++ compilers a good starting point for code optimization and gives programmers the concept of lazy evaluation.


Gaalop (Geometric Algebra Algorithms Optimizer) is software for optimizing geometric algebra files. Algorithms can be developed using the freely available CLUCalc software by Christian Perwass. Gaalop optimizes the algorithm and produces C++, OpenCL, CUDA, CLUCalc or LaTeX output (other output formats will follow). The optimized code contains no geometric algebra operations and can be run very efficiently on various platforms.


A code generator for geometric algebra. Currently supported languages are C, C++, C# and Java.


A software package for the global numerical analysis of dynamical systems and optimization problems based on set oriented techniques. It may e.g. be used to compute invariant sets, invariant manifolds, invariant measures and almost invariant sets in dynamical systems and to compute the globally optimal solutions of both scalar and multiobjective problems.


The Geo-interface to Atmosphere, Land, Earth, Ocean, NetCDF is an interoperability experiment for implementing and testing clients and servers for WCS gateways to netCDF datasets.


High throughput storage and retrieval of multidimensional data. Time-series data occurs in settings such as observations initiated by radars and satellites, checkpointing data representing state of the system at regular intervals, and analytics representing the evolution of extracted knowledge over time. Galileo is a demonstrably scalable storage framework for managing such time-series data.


A system that automatically executes "Galoized" serial C++ or Java code in parallel on shared-memory machines. It works by exploiting amorphous data-parallelism, which is present even in irregular codes that are organized around pointer-based data structures such as graphs and trees. The Galois system includes the Lonestar benchmark suite and the ParaMeter profiler.

Multicore processors are increasingly becoming the norm. As a result, we need to find ways to make it easier to write parallel programs. Galois allows the programmer to write serial C++ or Java code while still getting the performance of parallel execution. All the programmer has to do is use Galois-provided data structures, which are necessary for correct concurrent execution, and annotate which loops should be run in parallel. The Galois system then speculatively extracts as much parallelism as it can. The current release includes a dozen sample benchmark applications from a broad range of domains that are written using the Galois extensions and classes.


A language-independent, low-level networking layer that provides network-independent, high-performance communication primitives tailored for implementing parallel global address space SPMD languages such as UPC, Titanium, and Co-Array Fortran. The interface is primarily intended as a compilation target and for use by runtime library writers (as opposed to end users), and the primary goals are high performance, interface portability, and expressiveness. GASNet stands for "Global-Address Space Networking".


Python bindings for GASNet.

Berkeley UPC

An extension of the C programming language designed for high performance computing on large-scale parallel machines. The language provides a uniform programming model for both shared and distributed memory hardware. The programmer is presented with a single shared, partitioned address space, where variables may be directly read and written by any processor, but each variable is physically associated with a single processor. UPC uses a Single Program Multiple Data (SPMD) model of computation in which the amount of parallelism is fixed at program startup time, typically with a single thread of execution per processor.


An object-oriented geophysical and astrophysical spectral-element adaptive refinement code. Like most spectral-element codes, GASpAR combines finite-element efficiency with spectral-method accuracy. It is also designed to be flexible enough for a range of geophysics and astrophysics applications where turbulence or other complex multi-scale problems arise. The formalism accommodates both conforming and non-conforming elements.


A multi-purpose program for performing geometric algebra computations and visualizing geometric algebra.


The Global Cloud Resolving Model.

svn co svn://



Extensions to the SQLAlchemy framework to work with spatial databases. The supported database systems include PostGIS and Spatialite.


A Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. It provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language.


GeoJModelBuilder couples geoprocessing Web services, NASA World Wind and Sensor Web services to support geoprocessing modeling and environmental monitoring. The main goal of GeoJModelBuilder is to bring an easy-to-use tool to the geoscientific community.

The tool lets users drag and drop various geospatial services to visually generate workflows and interact with the workflows in a virtual globe environment. It also lets users audit trails of workflow executions and check the provenance of data products, and it supports scientific reproducibility.

The tool is developed in Java for its platform independence, and it runs on any operating system that supports Java, such as Windows or Unix/Linux.


A Python encoder/decoder for simple GIS features using the GeoJSON format.
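
To show what encoding and decoding a simple feature involves, the sketch below round-trips a GeoJSON Feature using only Python's standard `json` module. It deliberately does not use the library's own classes or functions; the station name and coordinates are hypothetical. Note that GeoJSON positions are ordered longitude first, then latitude.

```python
import json

# A GeoJSON Feature is a mapping with "type", "geometry" and "properties" members.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-105.27, 40.01]},  # [lon, lat]
    "properties": {"name": "hypothetical station"},
}

encoded = json.dumps(feature, sort_keys=True)  # Python mapping -> GeoJSON text
decoded = json.loads(encoded)                  # GeoJSON text -> Python mapping

lon, lat = decoded["geometry"]["coordinates"]
```

A dedicated encoder/decoder adds value on top of this by validating geometry types and mapping features to Python objects rather than bare dictionaries.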


GeoLearn is designed to enable rapid processing of large size satellite remote sensing data available in HDF EOS format. It has been tested primarily with MODIS land-surface data products. Use and analysis of these datasets are at the heart of a variety of scientific investigations pertaining to the study of the interaction between land-surface and climate, and prediction of terrestrial hydrologic processes.


Adds spatial capabilities to scripting languages, e.g. Python.


GmtPy provides seamless integration of GMT plotting into Python programs. On top of that it provides (in an opt-in fashion): autoscaling, automatic tick increment determination, layout management, and more.



A Python wrapper for the Google Chart API. The wrapper can render the URL of the Google chart, based on your parameters, or it can render an HTML img tag to insert into webpages on the fly. Made for dynamic Python websites (Django, Zope, CGI, etc.) that need on-the-fly chart generation without any extra modules.

Google Maps JavaScript API

The Google Maps Javascript API lets you embed Google Maps in your own web pages. Version 3 of this API is especially designed to be faster and more applicable to mobile devices, as well as traditional desktop browser applications. The API provides a number of utilities for manipulating maps (just like on the web page) and adding content to the map through a variety of services, allowing you to create robust maps applications on your website.


An API for the development of scalable, asynchronous and fault tolerant parallel applications.

GPU Ocelot

An open-source dynamic JIT compilation framework for GPU compute applications targeting a range of GPU and non-GPU execution targets. Ocelot supports CUDA applications and provides an implementation of the CUDA Runtime API enabling seamless integration. NVIDIA’s PTX virtual instruction set architecture is used as a device-agnostic program representation that captures the data-parallel SIMT execution model of CUDA applications. Ocelot supports several backend execution targets: a PTX emulator, NVIDIA GPUs, AMD GPUs, and a translator to LLVM for efficient execution of GPU kernels on multicore CPUs.


Graal is a new experimental just-in-time compiler for Java that is integrated with the HotSpot virtual machine. Its focus is to provide excellent peak performance via new techniques in the area of method inlining, removing object allocations, and speculative execution. The term GraalVM is used to denote a HotSpot virtual machine configured with Graal.

Truffle is a multi-language framework for executing dynamic languages that achieves high performance when combined with Graal. This OTN release includes a Truffle-based JavaScript execution engine that can be used to run JavaScript applications with GraalVM. There are several open source projects building Truffle-based runtimes for other languages, e.g., Ruby (see 'TruffleRuby'), R (see 'FastR'), or Python (see 'ZipPy').

GRACE Software

A collection of hundreds of Matlab scripts, many of which are useful for the geosciences.


A free and open source Geographic Information System (GIS) software suite used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization.


Software for controlling the motion of machines that make things. If the maker movement was an industry, Grbl would be the industry standard.


A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks.


A scalable data transfer management tool for the GridFTP transfer protocol. The goal is to reliably manage as much as 1+ PB of data across millions of file transfers.


A C code that provides a command line utility for non-interactive generation of multi-corner quasi-orthogonal grids inside simply connected polygonal regions. It is based on the CRDT algorithm that makes it possible to handle regions with elongated channels in a numerically robust way.


Provides C library functions and command line utilities for working with curvilinear grids. gridutils has been developed and used mainly for grids generated by gridgen, but can be used to handle arbitrary 2D quadrilateral simply connected multi-corner grids.


The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.


Python implementation of the thermodynamic equation of seawater (TEOS-10).


A Python library for generating GUIs for easy dataset editing and display.


The GOCE User Toolbox (GUT) is a compilation of tools for the use, viewing, analysis and post-processing of GOCE Level 2 mission data products, supporting applications in geodesy, oceanography and solid Earth physics. GUT is a command-line processor that has been designed for users at all levels of expertise.


The Geophysical Wavelet Library (GWL) is a software package based on the continuous wavelet transform. It performs the direct and inverse continuous wavelet transform, 2C and 3C polarization analysis and filtering, modeling of dispersed and attenuated wave propagation in the time-frequency domain, and optimization in the signal and wavelet domains, with the aim of extracting velocities and attenuation parameters from a seismogram. The novelty of this package is that we incorporate the continuous wavelet transform into the library, where the kernel is the time-frequency polarization and dispersion analysis. This library has a wide range of potential applications in the field of signal analysis and may be particularly suitable for geophysical problems, which we illustrate by analyzing synthetic, geomagnetic and real seismic data.


A modified version of the Hadoop MapReduce framework, designed to serve iterative applications. HaLoop not only extends MapReduce with programming support for iterative applications, but also dramatically improves their efficiency by making the task scheduler loop-aware and by adding various caching mechanisms. In the authors' evaluation on real queries and real datasets, HaLoop on average reduced query runtimes by a factor of 1.85 compared with Hadoop, and shuffled only 4% as much data between mappers and reducers.
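
The benefit of caching loop-invariant data in an iterative job can be seen in a single-process toy. The sketch below is not HaLoop or Hadoop code: `group_by_key`, `reachable` and the edge list are hypothetical, and the "cache" is simply an adjacency table that is grouped once, before the loop, instead of being reshuffled on every iteration. The loop runs a map step and a reduce step until a fixpoint is reached.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Shuffle step: group (key, value) pairs by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reachable(edges, source):
    """Iterative reachability query; returns (visited nodes, iterations run)."""
    # Loop-invariant input: grouped once and reused by every iteration.
    adjacency = group_by_key(edges)
    visited = {source}
    frontier = {source}
    iterations = 0
    while frontier:  # stop at the fixpoint, when no new nodes appear
        iterations += 1
        # Map: join the current frontier against the cached adjacency table.
        emitted = [(dst, 1) for node in frontier for dst in adjacency.get(node, [])]
        # Reduce: deduplicate by key and keep only newly discovered nodes.
        frontier = set(group_by_key(emitted)) - visited
        visited |= frontier
    return visited, iterations


# Hypothetical edge list: a three-node cycle plus a disconnected edge.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
nodes, rounds = reachable(edges, "a")
```

In this toy only the small frontier changes between rounds; the full edge list is never regrouped, which is the kind of repeated work a loop-aware scheduler with caching is meant to avoid.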



The ExaScale IO (ESIO) library provides simple, high throughput input and output of structured data sets using parallel HDF5. ESIO is designed to support reading and writing turbulence simulation restart files but it may be useful in other contexts. The library is written in C99 and may be used by C89 or C++ applications. A Fortran API built atop the F2003 standard ISO_C_BINDING is also available.


A Python interface to HDF5.


A set of utilities for visualization and conversion of scientific data in the free, portable HDF5 format. Besides providing a simple tool for batch visualization as PNG images, h5utils also includes programs to convert HDF5 datasets into the formats required by other free visualization software (e.g. plain text, Vis5d, and VTK).


A visual tool for browsing and editing HDF4 and HDF5 files.


A high level interface to the Hierarchical Data Format, version 5, developed and maintained by the HDF group at the National Center for Supercomputing Applications (NCSA), at the University of Illinois at Urbana-Champaign. HDF5 is a file format designed for maximum flexibility and efficiency and it makes use of modern software technology. HDF5 sports such fundamental characteristics as platform independence and efficient built-in compression, and it can be used to store virtually any kind of scientific data. HL-HDF is designed to focus on selected HDF5 functionality and make it available to users at a high level of abstraction to facilitate data management. This distribution contains HL-HDF source code and associated documentation. The first version also comes prebuilt for a multitude of platforms.


The H5hut library is an implementation of several data models for particle-based simulations that encapsulates the complexity of parallel HDF5 and is simple to use, yet does not compromise performance. H5hut is tuned for writing collectively from all processors to a single, shared file. Although collective I/O performance is typically (but not always) lower than that of file-per-processor, having a shared file simplifies scientific workflows in which simulation data needs to be analyzed or visualized. In this scenario, the file-per-processor approach leads to data management headaches because large collections of files are unwieldy to manage from a file system standpoint. On a parallel file system like Lustre, even the ls utility will break when presented with tens of thousands of files, and performance begins to degrade with this number of files because of contention at the metadata server. Often a post-processing step is necessary to refactor file-per-processor data into a format that is readable by the analysis tool. In contrast, H5hut files can be directly loaded in parallel by visualization tools like VisIt and ParaView. H5hut is a veneer API for HDF5: H5hut files are also valid HDF5 files and are compatible with other HDF5-based interfaces and tools. For example, the h5dump tool that comes standard with HDF5 can export H5hut files to ASCII or XML for additional portability. H5hut also includes tools to convert H5hut data to the Visualization ToolKit (VTK) format and to generate scripts for the GNUplot data plotting tool.


An unstructured, high-order, parallel Discontinuous Galerkin (DG) code that I am developing as part of my PhD project. hedge’s design is focused on two things: being fast and easy to use. While the need for speed dictates implementation in a low-level language, these same low-level languages become quite cumbersome at a higher level of abstraction. This is where the "h" in hedge comes from; it takes a hybrid approach. While a small core is written in C++ for speed, all user-visible functionality is driven from Python.


A C++ library for rapid development of adaptive hp-FEM / hp-DG solvers. Novel hp-adaptivity algorithms help solve a large variety of problems ranging from ODE and stationary linear PDE to complex time-dependent nonlinear multiphysics PDE systems.


A multi-purpose finite element software providing powerful tools for efficient and accurate solution of a wide range of problems modeled by partial differential equations (PDEs). Based on object-oriented concepts and the full capabilities of C++ the HiFlow³ project follows a modular and generic approach for building efficient parallel numerical solvers. It provides highly capable modules dealing with the mesh setup, finite element spaces, degrees of freedom, linear algebra routines, numerical solvers, and output data for visualization. Parallelism – as the basis for high performance simulations on modern computing systems – is introduced on two levels: coarse-grained parallelism by means of distributed grids and distributed data structures, and fine-grained parallelism by means of platform-optimized linear algebra back-ends.


A C++ software package for the discontinuous Galerkin method. This framework is intended for those who want to easily develop and apply discontinuous Galerkin methods to various physical problems, especially partial differential equations arising from fluid mechanics and electromagnetism. Using HPGEM, one can numerically solve problems ranging from the simplest classroom examples, such as the linear advection and Burgers equations, to the most complicated practical examples, such as the shallow water, Euler, Navier-Stokes and Maxwell equations.


A general purpose C++ runtime system for parallel and distributed applications of any scale. The HPX runtime software package is a modular, feature-complete, and performance oriented representation of the ParalleX execution model targeted at conventional parallel computing architectures such as SMP nodes and commodity clusters. HPX is a C++ library that supports a set of critical mechanisms for dynamic adaptive resource management and lightweight task scheduling within the context of a global address space.


Standards-compliant library for parsing and serializing HTML documents and fragments in Python.


A comprehensive navigational query language for relational databases. HTSQL is designed for data analysts and other accidental programmers who have complex business inquiries to solve and need a productive tool to write and share database queries.


A software platform for creating dynamic web sites that support scientific research and educational activities.


I2P is an anonymous network, exposing a simple layer that applications can use to anonymously and securely send messages to each other. The network itself is strictly message based (a la IP), but there is a library available to allow reliable streaming communication on top of it (a la TCP). All communication is end to end encrypted (in total there are four layers of encryption used when sending a message), and even the end points ("destinations") are cryptographic identifiers (essentially a pair of public keys).


The Ibis Portability layer (IPL) is a communication library specifically designed for usage in a grid environment. It has a number of properties which help to achieve its goal of providing programmers with an easy to use, reliable grid communication infrastructure.


A Java-based software framework for analyzing and visualizing geoscience data. The IDV "reference application" is a geoscience display and analysis software system with many of the standard data displays that other Unidata software (e.g. GEMPAK and McIDAS) provide. It brings together the ability to display and work with satellite imagery, gridded data (for example, numerical weather prediction model output), surface observations, balloon soundings, NWS WSR-88D Level II and Level III RADAR data, and NOAA National Profiler Network data, all within a unified interface. It also provides 3-D views of the earth system and allows users to interactively slice, dice, and probe the data, creating cross-sections, profiles, animations and value read-outs of multi-dimensional data sets. The IDV can display any Earth-located data if it is provided in a known format.


A free, open source, visualization and data analysis software package that is the fifth generation in SSEC’s 40 year history of sophisticated McIDAS (Man computer Interactive Data Access System) software packages. McIDAS-V displays weather satellite (including hyperspectral) and other geophysical data in 2- and 3-dimensions, and can be used to analyze and manipulate the data with its powerful mathematical functions.


A software package for exploration and visualization of Earth-located geoscience data.


A Python library for defining domain specific languages and generating high performance code.


A high performance math library for programmers and scientists. Extending the .NET framework with tools needed for scientific computing, it simplifies the implementation of all kinds of numerical algorithms in convenient, familiar C#-syntax – optimized to the speed of C and FORTRAN.


A public domain Java image processing program. It can display, edit, analyze, process, save and print 8-bit, 16-bit and 32-bit images. It can read many image formats including TIFF, GIF, JPEG, BMP, DICOM, FITS and "raw". It supports "stacks", a series of images that share a single window. It is multithreaded, so time-consuming operations such as image file reading can be performed in parallel with other operations.

ImageJ was designed with an open architecture that provides extensibility via Java plugins. Custom acquisition, analysis and processing plugins can be developed using ImageJ’s built in editor and Java compiler. User-written plugins make it possible to solve almost any image processing or analysis problem.


A distribution of ImageJ (and soon ImageJ2) together with Java, Java 3D and a lot of plugins organized into a coherent menu structure. Fiji compares to ImageJ as Ubuntu compares to Linux.


A project to combine VTK with ImageJ.


A project to develop the next-generation version of ImageJ.


The motivation for developing ImageTools (formerly known as Im2Learn) comes from academic, government and industrial collaborations that involve development of new computer methods and solutions for understanding complex data sets. Images and other types of data generated by various instruments and sensors form complex and highly heterogeneous data sets, and pose challenges on knowledge extraction.

The main goal of the ImageTools research and development is to automate information processing of repetitive, laborious and tedious analysis tasks and build user-friendly decision-making systems that operate in automated or semi-automated mode in a variety of applications. The development is based on theoretical foundations of image and video processing, computer vision, data fusion, statistical and spectral modeling.


A powerful numerical tool for academic research. It can combine the versatility of industrial codes with the accuracy of spectral codes. Thanks to a very successful project with NAG and HECToR (UK Supercomputing facility), Incompact3d can be used on up to hundreds of thousands of computational cores to solve the incompressible Navier-Stokes equations. This high level of parallelisation is achieved thanks to a highly scalable 2D decomposition library and a distributed Fast Fourier Transform (FFT) interface.



A distributed operating system, originally developed at Bell Labs, but now developed and maintained by Vita Nuova® as Free Software. Applications written in Inferno’s concurrent programming language, Limbo, are compiled to its portable virtual machine code (Dis), to run anywhere on a network in the portable environment that Inferno provides. Unusually, that environment looks and acts like a complete operating system.

The use of a high-level language and virtual machine is sensible but mundane. The interesting thing is the system’s representation of services and resources. They are represented in a file-like name hierarchy. Programs access them using only the file operations open, read/write, and close. The files may of course represent stored data, but may also be devices, network and protocol interfaces, dynamic data sources, and services. The approach unifies and provides basic naming, structuring, and access control mechanisms for all system resources. A single file-service protocol (the same as Plan 9’s 9P) makes all those resources available for import or export throughout the network in a uniform way, independent of location. An application simply attaches the resources it needs to its own per-process name hierarchy (name space).

The system can be used to build portable client and server applications. It makes it straightforward to build lean applications that share all manner of resources over a network, without the cruft of much of the Grid software one sees.


The inspyred library grew out of insights from Ken de Jong’s book “Evolutionary Computation: A Unified Approach.” The goal of the library is to separate problem-specific computation from algorithm-specific computation. Any bio-inspired algorithm has at least two aspects that are entirely problem-specific: what solutions to the problem look like and how such solutions are evaluated. These components will certainly change from problem to problem. For instance, a problem dealing with optimizing the volume of a box might represent solutions as a three-element list of real values for the length, width, and height, respectively. In contrast, a problem dealing with optimizing a set of rules for escaping a maze might represent solutions as a list of pairs of elements, where each pair contains the two-dimensional neighborhood and the action to take in such a case.

On the other hand, there are algorithm-specific components that may make no (or only modest) assumptions about the type of solutions upon which they operate. These components include the mechanism by which parents are selected, the way offspring are generated, and the way individuals are replaced in succeeding generations. For example, the ever-popular tournament selection scheme makes no assumptions whatsoever about the type of solutions it is selecting. The n-point crossover operator, on the other hand, does make an assumption that the solutions will be linear lists that can be “sliced up,” but it makes no assumptions about the contents of such lists. They could be lists of numbers, strings, other lists, or something even more exotic.

The central design principle for inspyred is to separate problem-specific components from algorithm-specific components in a clean way so as to make algorithms as general as possible across a range of different problems.
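The separation described above can be illustrated with a minimal sketch in plain Python. It does not use inspyred's own API: the names `tournament_select`, `n_point_crossover`, and `box_volume` are hypothetical and chosen only for this illustration. The selection routine knows nothing about the shape of a candidate, the crossover routine assumes only a sliceable sequence, and the box-volume evaluator is the single problem-specific piece.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Return the fittest of k randomly drawn candidates.

    Algorithm-specific: makes no assumption about what a candidate is.
    """
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

def n_point_crossover(mom, dad, n, rng):
    """Cut two equal-length sequences at n points and alternate the segments.

    Assumes only that candidates can be sliced, not what they contain.
    """
    cuts = sorted(rng.sample(range(1, len(mom)), n))
    parents = (mom, dad)
    child, start = [], 0
    for i, cut in enumerate(cuts + [len(mom)]):
        child.extend(parents[i % 2][start:cut])
        start = cut
    return child

def box_volume(box):
    """Problem-specific: a candidate is [length, width, height]."""
    length, width, height = box
    return length * width * height

rng = random.Random(0)
population = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(20)]
parent_a = tournament_select(population, box_volume, k=3, rng=rng)
parent_b = tournament_select(population, box_volume, k=3, rng=rng)
child = n_point_crossover(parent_a, parent_b, n=1, rng=rng)
```

Swapping in a different problem would mean replacing only `box_volume` and the way the initial population is built; the two operators could stay as they are.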


Invenio is a free software suite enabling you to run your own digital library or document repository on the web. The technology offered by the software covers all aspects of digital library management from document ingestion through classification, indexing, and curation to dissemination. Invenio complies with standards such as the Open Archives Initiative metadata harvesting protocol (OAI-PMH) and uses MARC 21 as its underlying bibliographic format. The flexibility and performance of Invenio make it a comprehensive solution for management of document repositories of moderate to large sizes (several millions of records).


A scalable, unified high-end computing I/O forwarding software layer.


Gallery of IPython Notebook Themes -

A Gallery of Interesting Python Notebooks

Notebook Gallery


Stores IPython notebooks automagically onto OpenStack clouds through Swift.


JavaScript extensions for IPython notebook.



The integrated Rule-Oriented Data-management System, a community-driven, open source, data grid software solution. It helps researchers, archivists and others manage (organize, share, protect, and preserve) large sets of computer files. Collections can range in size from moderate to a hundred million files or more totaling petabytes of data. This is the open-source successor to SRB.


A global geometric framework for nonlinear dimensionality reduction. A Matlab package is available.


A cross-platform computational fluid dynamics (CFD) library for mesh-free, particle-based simulation and visualization of incompressible flows using Smoothed Particle Hydrodynamics (SPH) methods. The library is open source and written in pure C++ and OpenCL, the new standard for parallel programming of modern processors. It will make full use of GPUs, CPUs and other OpenCL-enabled devices in the running system to accelerate computation.


A Tile Assembly Model simulator that allows users to design tilesets and seeds and to simulate assemblies. The simulator allows for graphical creation of seed assemblies, fast forwarding and rewinding of assembly growth, and easy zooming, scrolling, and inspection of assemblies among other features. The graphical tile type editor allows tile types to be easily designed and manipulated. Assemblies and tile sets can be created, saved, and reloaded.

Our research is motivated by the prospect, raised by pioneering work of Seeman, Winfree, and Rothemund, of engineering structures that autonomously assemble themselves from molecular components. We are primarily interested in understanding the power and limitations of this "programming of matter". Our work includes the development and analysis of mathematical models of self-assembly, the creation and use of software environments for developing and simulating self-assembly systems, and studies of the self-assembly of fractals and other complex structures. We also work to adapt methods that software engineers have developed for creating, controlling, and reasoning about systems of immense complexity (requirements engineering, programming languages, formal verification, software safety, …​) to the even greater challenges that nanotechnology will confront.


A C++ library of mathematical, signal processing and communication classes and functions. Its main use is in simulation of communication systems and for performing research in the area of communications. The kernel of the library consists of generic vector and matrix classes, and a set of accompanying routines. Such a kernel makes IT++ similar to MATLAB, GNU Octave or SciPy.


Technologies that enable application scientists to easily use multiple mesh and discretization strategies within a single simulation on petascale computers.


Python bindings for ITAPS interfaces.


A code library which provides geometry functionality used for mesh generation and other applications. This functionality includes that commonly found in solid modeling engines, like geometry creation, query and modification; CGMA also includes capabilities not commonly found in solid modeling engines, like geometry decomposition tools and support for shared material interfaces.


MeshKit is an open-source library of mesh generation functionality. Its design philosophy is two-fold: it provides a collection of meshing algorithms for use in real meshing problems, along with other tools commonly needed to support mesh generation (coordination of BREP-based meshing process, mesh smoothing, etc.); and it serves as a platform in which to perform mesh generation algorithm research.

MeshKit has general mesh manipulation and generation functions such as Copy, Move, Rotate and Extrude mesh. In addition, new quad mesh and embedded boundary Cartesian mesh (EBMesh) algorithms have been developed. Interfaces to several public-domain tetrahedral meshing algorithms (Gmsh, netgen) are also offered.

This library interacts with mesh data mostly through iMesh including accessing the mesh in parallel. It also can interact with iGeom interface to provide geometry functionality such as importing solid model based geometries. iGeom and iMesh are implemented in the CGM and MOAB packages, respectively. For some non-existing functions in iMesh such as tree-construction and ray-tracing, MeshKit also interacts with MOAB functions directly.


A component for representing and evaluating mesh data. MOAB implements the ITAPS iMesh interface; iMesh is a common interface to mesh data implemented by several different packages, including MOAB. Various tools like smoothing, adaptive mesh refinement, and parallel mesh communication are implemented on top of iMesh.


An attempt to write a Jarvis-like assistant in Python.


J is a modern, high-level, general-purpose, high-performance programming language. J is particularly strong in the mathematical, statistical, and logical analysis of data. It is a powerful tool in building new and better solutions to old problems and even better at finding solutions where the problem is not already well understood.


JHOVE2 is a framework and application for next-generation format-aware characterization of digital objects. The function of JHOVE2 is encapsulated in a series of modules that can be configured for use within the framework’s plug-in architecture. The NetCDF Format module, denominated JANEME: J-NetCDF Metadata Extractor, provides characterization services for the netCDF family of formats consisting of the profiles netCDF-3 and netCDF-4 and for the GRIB family (GRIB 1.0 and 2.0) as well.

JANEME is able to parse and characterize files in NetCDF and GRIB format via the Unidata netcdf-java library 4.1 (Unidata NetCDF-java) and to fill out templates conforming to Dublin Core and a c3grid iso19115 compatible profile with the extracted metadata while supporting JHOVE’s standard output as well. Additionally, it supplies an axis2 web service deployable on any Java application server, e.g., Tomcat.


A set of Matlab functions for the purpose of analyzing data. It consists of four hundred m-files spanning thirty-five thousand lines of code. JLAB includes functions ranging in complexity from one-line aliases to high-level algorithms for certain specialized tasks. About four hundred automated tests and dozens of scripts for sample figures help keep things organized.

     Jarray     - Vector, matrix, and N-D array tools.
     Jmath      - Mathematical aliases and basic functions.
     Jpoly      - Special polynomials, matrices, and functions.
     Jgraph     - Fine-tuning and customizing figures.
     Jstrings   - Strings, files, and variables.
     Jstats     - Statistical tools and probability distributions.
     Jsignal    - Signal processing, wavelet and spectral analysis.
     Jellipse   - Elliptical (bivariate) time series analysis.
     Jcell      - Tools for operating on cell arrays of numerical arrays.
     Vtools     - Operations on multiple data arrays simultaneously.


The Kepler Project is dedicated to furthering and supporting the capabilities, use, and awareness of the free and open source, scientific workflow application, Kepler. Kepler is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet, and is an effective environment for integrating disparate software components, such as merging "R" scripts with compiled "C" code, or facilitating remote, distributed execution of models. Using Kepler’s graphical user interface, users simply select and then connect pertinent analytical components and data sources to create a "scientific workflow"—an executable representation of the steps required to generate results. The Kepler software helps users share and reuse data, workflows, and components developed by the scientific community to address common needs.



Python functions for reading and writing KML.

See also Fiona, Shapely and Rtree.


KGPU is a GPU computing framework for the Linux kernel. It allows the Linux kernel to call CUDA programs running on GPUs directly. The motivation is to augment operating systems with GPUs so that not only userspace applications but also the operating system itself can benefit from GPU acceleration. It can also free the CPU from some computation-intensive work by enabling the GPU as an extra computing device.


Open source Python library for rapid development of applications that make use of innovative user interfaces, such as multi-touch apps.


A user-friendly graphical workbench for the entire analysis process: data access, data transformation, initial investigation, powerful predictive analytics, visualisation and reporting. The open integration platform provides over 1000 modules (nodes), including those of the KNIME community and its extensive partner network.

KNIME can be downloaded onto the desktop and used free of charge. KNIME products include additional functionalities such as shared repositories, authentication, remote execution, scheduling, SOA integration and a web user interface as well as world-class support. Robust big data extensions are available for distributed frameworks such as Hadoop.


An Open-Source framework for the implementation of numerical methods for the solution of engineering problems. It is written in C++ and is designed to allow collaborative development by large teams of researchers focusing on modularity as well as on performance. The Kratos features a "core" and "applications" approach where "standard tools" (databases, linear algebra, search structures, etc…​) come as a part of the core and are available as building blocks in the development of "applications" which focus on the solution of the problems of interest. Its ultimate goal is to simplify the development of new numerical methods.


An extensible XSLT-based framework for extracting RDF from XML, supporting multiple input languages as well as multiple output RDF notations.


Krita is a FREE digital painting and illustration application. Krita offers CMYK support, HDR painting, perspective grids, dockers, filters, painting assistants, and many other features you would expect.


Creates PNG images of mathematical expressions formatted in LaTeX. While it can convert a whole LaTeX document, it is designed to easily generate images from just a fragment of LaTeX code. It depends on other software: latex, dvips, and convert. (The last one is from the ImageMagick graphics toolset.) If you already work with LaTeX on a modern Unix or Linux system, you probably already have all of that installed.


To calculate backward-in-time, finite-size Lyapunov exponents (FSLEs) of the global oceans.


A signal-processing-oriented command language with MATLAB-like syntax which includes a high-level object-oriented graphic language. It lets you work with high-level structures such as signals, images, wavelet transforms, extrema representations, short-time Fourier transforms, etc.


In the process of developing the Digital Library of Mathematical Functions, we needed a means of transforming the LaTeX sources of our material into XML which would be used for further manipulations, rearrangements and construction of the web site. In particular, a true ‘Digital Library’ should focus on the semantics of the material, and so we should convert the mathematical material into both content and presentation MathML. At the time, we found no software suitable to our needs, so we began development of LaTeXML in-house.

In brief, latexml is a program, written in Perl, that attempts to faithfully mimic TeX’s behavior, but produces XML instead of dvi. The document model of the target XML makes explicit the model implied by LaTeX. The processing and model are both extensible; you can define the mapping between TeX constructs and the XML fragments to be created. A postprocessor, latexmlpost, converts this XML into other formats such as HTML or XHTML, with options to convert the math into MathML (currently only presentation) or images.


The Local Ensemble Transform Kalman Filter is an advanced data assimilation method for many possible applications.


Much computational science deals with the approximate solution of models described by systems of partial differential equations; these are used across the entire breadth of the quantitative sciences. Such a model takes as input the physical state at some initial time and runs forward in time to compute the state at some later time of interest; that is, it maps cause to effect, and so it is referred to as the forward model. For a given forward model, one can associate an adjoint model, which does the opposite: it maps from effect back to cause, and so runs backwards in time. Once an adjoint model is available, it makes possible a number of very powerful techniques: optimise engineering designs, assimilate data from physical measurements, estimate unknown parameters in the forward model, and estimate the approximation error in quantities of interest. Such applications are of huge interest and importance across all of engineering and the quantitative sciences. As computational science moves from mere simulation to optimisation, adjoint modelling will only grow in importance.

The fundamental abstraction of algorithmic differentiation is that it treats the model as a sequence of primitive instructions, each of which may be differentiated in turn and composed using the chain rule. libadjoint explores a similar, but higher-level abstraction: that the model is a sequence of linear solves. In this approach, the model is instrumented with library calls that record what operators it is assembling and what they are being applied to, in an analogous manner to building a tape for reverse-mode AD. The model developer then provides callback routines that compute the action of or assemble these operators to the library. With this information, the library may then assemble the adjoint of each equation solved in the forward model automatically. This promises to make adjointing models significantly easier than it currently is.
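The idea of treating a model as a recorded sequence of linear solves can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions and not libadjoint's API: the functions `forward` and `adjoint`, the 2x2 operator `A`, and the objective (the first component of the final state) are all hypothetical. The forward run records each operator on a tape; the adjoint run replays the tape in reverse and solves with the transposed operators.

```python
def solve2(A, rhs):
    """Solve a 2x2 linear system A x = rhs with the explicit inverse."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]

def transpose(A):
    (a, b), (c, d) = A
    return [[a, c], [b, d]]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def forward(x0, A, steps):
    """Forward model: each step solves A x_next = x_current.

    The operator used in each solve is recorded on a tape.
    """
    x, tape = list(x0), []
    for _ in range(steps):
        tape.append(A)
        x = solve2(A, x)
    return x, tape

def adjoint(tape, g):
    """Adjoint model: replay the tape backwards with transposed operators.

    Returns the gradient of g . x_final with respect to x0.
    """
    lam = list(g)
    for A in reversed(tape):
        lam = solve2(transpose(A), lam)
    return lam

A = [[1.1, 0.2], [-0.3, 1.05]]   # hypothetical, deliberately non-symmetric
x0 = [1.0, 2.0]
g = [1.0, 0.0]                   # objective: first component of final state

x_final, tape = forward(x0, A, steps=5)
objective = dot(g, x_final)
gradient = adjoint(tape, g)
```

Because this toy model is linear in `x0`, the objective equals `dot(gradient, x0)`, which gives a quick consistency check on the adjoint run.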


A library for state-space modelling and Bayesian inference on high-performance computer hardware, including multi-core CPUs, many-core GPUs (graphics processing units) and distributed-memory clusters. The staple methods of LibBi are based on sequential Monte Carlo (SMC), also known as particle filtering. These methods include particle Markov chain Monte Carlo (PMCMC) and SMC2. Other methods include the extended Kalman filter and some parameter optimisation routines. LibBi consists of a C++ template library, as well as a parser and compiler, written in Perl, for its own modelling language.


Lorenz 96 differential equation model.
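For reference, the Lorenz 96 model is usually written as dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices. A minimal plain-Python sketch follows; it is not taken from the linked package, and the function names, the 40-variable state, the forcing F = 8, and the step size are assumptions chosen to match common usage.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Right-hand side of the Lorenz 96 equations with cyclic indexing."""
    n = len(x)
    # Negative indices wrap around in Python, which handles i - 1 and i - 2.
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]

def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(state, slope, scale):
        return [s + scale * k for s, k in zip(state, slope)]

    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(x, k1, dt / 2), forcing)
    k3 = lorenz96_tendency(shifted(x, k2, dt / 2), forcing)
    k4 = lorenz96_tendency(shifted(x, k3, dt), forcing)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# Start from the uniform equilibrium x_i = F and perturb one variable.
state = [8.0] * 40
state[19] += 0.01
for _ in range(100):
    state = rk4_step(state, dt=0.01)
```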


A C/C++ library for reading and writing the very common LAS LiDAR format. The ASPRS LAS format is a sequential binary format used to store data from LiDAR sensors and by LiDAR processing software for data interchange and archival.


A LAS reader plugin for ParaView.


A C++ library for dynamic, multidimensional arrays.


Python bindings for LibDyND.


An auto-parallelizing library to speed up your stencil code based computer simulations. It runs on virtually all current architectures, be it multi-cores, GPUs, or large scale MPI clusters.


A project to provide an implementation of the GDAL specification within the HDF5 file format. Specifically, the format will support raster attribute tables (commonly not included within other formats), image pyramids, GDAL meta-data, in-built statistics while also providing large file handling with compression used throughout the file. Being based on the HDF5 standard, it will also provide a base from which other formats could be derived and will be a good choice for long term data archiving. An independent software library (libKEA) has been provided through which complete access to the KEA image format is provided alongside a GDAL driver allowing KEA images to be used through any GDAL supported software.


The libMesh library provides a framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms. A major goal of the library is to provide support for adaptive mesh refinement (AMR) computations in parallel while allowing a research scientist to focus on the physics they are modeling.

libMesh currently supports 1D, 2D, and 3D steady and transient simulations on a variety of popular geometric and finite element types. The library makes use of high-quality, existing software whenever possible. PETSc or the Trilinos Project are used for the solution of linear systems on both serial and parallel platforms, and LASPack is included with the library to provide linear solver support on serial machines. An optional interface to SLEPc is also provided for solving both standard and generalized eigenvalue problems.


A parallel C++ framework for multiscale coupling methods dedicated to material simulations. The framework is designed as a library providing an API that makes it possible to program coupled simulations. At present, the stable implemented coupling method is based on the Bridging Method. The coupled parts can be provided by existing projects: the API offers templated C++ interfaces that minimize the cost of integration, which takes the form of plugins or similar. Several such codes have been integrated to provide a functional prototype of the framework. For example, the molecular dynamics codes that have been integrated are Stamp (a CEA code) and LAMMPS (Sandia laboratories). The only continuum mechanics code, discretized by finite elements, is based on the libMesh framework.


A C++ library computing a principal component analysis plus corresponding transformations. This requires the Armadillo library.


Library for spherical harmonic transforms. A collection of algorithms for efficient conversion between maps on the sphere and their spherical harmonic coefficients. It supports a wide range of pixelisations (including HEALPix, GLESP, and ECP).


A software tool for creating multiphysics simulation codes.


A C++ template library for exact, high-performance linear algebra computation with dense, sparse, and structured matrices over the integers and over finite fields.



A platform for development and distribution of scientific software.


The Large Time/Frequency Analysis Toolbox (LTFAT) is a Matlab/Octave toolbox for working with time-frequency analysis and synthesis. It is intended both as an educational and a computational tool. The toolbox provides a large number of linear transforms including Gabor and wavelet transforms along with routines for constructing windows (filter prototypes) and routines for manipulating coefficients.


An extended version of pdfTeX using Lua as an embedded scripting language. The LuaTeX project’s main objective is to provide an open and configurable variant of TeX while at the same time offering downward compatibility. LuaTeX uses Unicode (as UTF-8) as its default input encoding, and is able to use modern (OpenType) fonts (for both text and mathematics).


Luigi is a Python package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.

The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks, automate them, and failures will happen. These tasks can be anything, but are typically long running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else.

There are other software packages that focus on lower level aspects of data processing, like Hive, Pig, or Cascading. Luigi is not a framework to replace these. Instead it helps you stitch many tasks together, where each task can be a Hive query, a Hadoop job in Java, a Python snippet, dumping a table from a database, or anything else. It’s easy to build up long-running pipelines that comprise thousands of tasks and take days or weeks to complete. Luigi takes care of a lot of the workflow management so that you can focus on the tasks themselves and their dependencies.
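Luigi expresses such pipelines as task classes that declare what they require, what they output, and how they run. The dependency-resolution idea behind this can be sketched without Luigi at all. The plain-Python stand-in below is an assumption-laden illustration, not Luigi's implementation: `run_pipeline` and the four task names are hypothetical, completed tasks are tracked in a set rather than by checking output targets, and there is no cycle detection or failure handling.

```python
def run_pipeline(tasks, target, done=None, log=None):
    """Run `target` after all of its dependencies, depth first.

    `tasks` maps a task name to (list of dependency names, callable).
    Tasks already in `done` are skipped, so shared dependencies run once.
    """
    done = set() if done is None else done
    log = [] if log is None else log
    if target in done:
        return log
    dependencies, action = tasks[target]
    for dependency in dependencies:
        run_pipeline(tasks, dependency, done, log)
    action()
    done.add(target)
    log.append(target)
    return log

# Hypothetical tasks that pass data through a shared dictionary.
results = {}
tasks = {
    "dump_table": ([], lambda: results.update(rows=[3, 1, 2])),
    "sort_rows": (["dump_table"],
                  lambda: results.update(ordered=sorted(results["rows"]))),
    "count_rows": (["dump_table"],
                   lambda: results.update(count=len(results["rows"]))),
    "report": (["sort_rows", "count_rows"],
               lambda: results.update(
                   report=(results["count"], results["ordered"]))),
}

order = run_pipeline(tasks, "report")
```

Asking for `report` is enough: the runner works out that `dump_table` must run first and runs it only once, even though two tasks depend on it.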


Locally Weighted Projection Regression (LWPR) is a recent algorithm that achieves nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it uses locally linear models, spanned by a small number of univariate regressions in selected directions in input space. A locally weighted variant of Partial Least Squares (PLS) is employed for doing the dimensionality reduction.

A Python version is available.
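The locally linear models at the core of the method can be illustrated with ordinary locally weighted regression in one dimension. The sketch below is not LWPR itself: it omits the incremental learning and the Partial Least Squares projection, and the function name `locally_weighted_predict`, the Gaussian kernel, and the bandwidth values are assumptions made for this example.

```python
import math

def locally_weighted_predict(xs, ys, query, bandwidth):
    """Fit a weighted straight line centred on `query` and evaluate it there.

    Points near `query` get weights close to 1; distant points get
    weights close to 0 (Gaussian kernel of width `bandwidth`).
    """
    weights = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    cov_xy = sum(w * (x - mean_x) * (y - mean_y)
                 for w, x, y in zip(weights, xs, ys))
    slope = cov_xy / var_x
    return mean_y + slope * (query - mean_x)

# A nonlinear target approximated by local straight lines.
xs = [0.1 * i for i in range(64)]
ys = [math.sin(x) for x in xs]
estimate = locally_weighted_predict(xs, ys, query=1.0, bandwidth=0.2)
```

A smaller bandwidth follows curvature more closely but averages over fewer points, which is the usual trade-off in local regression.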


An open-source software package for multidimensional data analysis and reproducible computational experiments.


The latest generation of the ECMWF’s Meteorological plotting software MAGICS. Although completely redesigned in C++, it is intended to be as backwards-compatible as possible with the Fortran interface. The contour package was rewritten and no longer depends on the CONICON licence. Besides its programming interfaces (Fortran and C), Magics offers MagML, a plot description language based on XML. Magics supports the plotting of contours, wind fields, observations, satellite images, symbols, text, axis and graphs (including boxplots). Data fields to be plotted may be presented in various formats, for instance GRIB 1 and 2 code data, gaussian grid, regularly spaced grid and fitted data. GRIB data is handled via ECMWF’s GRIB API software. Input data can also be in BUFR and NetCDF format or retrieved from an ODB database. The produced meteorological plots can be saved in various formats, such as PostScript, EPS, PDF, GIF, PNG and SVG.


Git-backed Evernote replacement in Python.


A set of functions for image processing and computer vision in Python.


A workflow engine for executing large complex workflows on clusters, clouds, and grids. Makeflow is very similar to traditional Make, so if you can write a Makefile, then you can write a Makeflow. You can be up and running workflows in a matter of minutes.


A flexible and complete framework for building rich web-mapping applications. It emphasizes high productivity and high-quality development. MapFish is based on the Pylons Python web framework. MapFish extends Pylons with geospatial-specific functionality. For example, MapFish provides specific tools for creating web services that allow querying and editing geographic objects. MapFish also provides a complete RIA-oriented JavaScript toolbox, a JavaScript testing environment, and tools for compressing JavaScript code. The JavaScript toolbox is composed of the ExtJS, OpenLayers, and GeoExt JavaScript toolkits.


MASA (Manufactured Analytical Solution Abstraction) is a library written in C++ (with C and Fortran90 interfaces) which provides a suite of manufactured solutions for the software verification of partial differential equation solvers in multiple dimensions. MASA provides two methods to import manufactured solutions into the library. Users can either generate their own source terms, or they can use the automatic differentiation capabilities provided in MASA. New solutions are added to the library with the "MASA-import" script.
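The verification technique that manufactured solutions support can be shown with a small worked example. The sketch below does not use MASA: the solution u(x) = sin(pi x), the 1D Poisson problem, the finite-difference solver, and all function names are assumptions made for illustration. The solution is chosen first, the source term f = -u'' is derived from it by hand, and the solver is then checked against the known answer at two resolutions.

```python
import math

def manufactured_u(x):
    """Chosen exact solution; it satisfies u(0) = u(1) = 0."""
    return math.sin(math.pi * x)

def manufactured_source(x):
    """Source term f = -u'' derived analytically from manufactured_u."""
    return math.pi ** 2 * math.sin(math.pi * x)

def solve_poisson_1d(source, n):
    """Solve -u'' = source on (0, 1), u(0) = u(1) = 0, with n intervals.

    Second-order central differences; the tridiagonal system
    (-1, 2, -1) u = h^2 f is solved with the Thomas algorithm.
    """
    h = 1.0 / n
    m = n - 1                                   # interior unknowns
    diag = [2.0] * m
    rhs = [h * h * source((i + 1) * h) for i in range(m)]
    for i in range(1, m):                       # forward elimination
        factor = -1.0 / diag[i - 1]
        diag[i] -= factor * -1.0
        rhs[i] -= factor * rhs[i - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):              # back substitution
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return [0.0] + u + [0.0]

def max_error(n):
    """Largest nodal difference between numerical and manufactured solution."""
    numerical = solve_poisson_1d(manufactured_source, n)
    return max(abs(v - manufactured_u(i / n)) for i, v in enumerate(numerical))

coarse, fine = max_error(16), max_error(32)
observed_order = math.log2(coarse / fine)
```

If the solver is implemented correctly, `observed_order` should be close to 2, the formal order of the central-difference scheme; a markedly different value would point to a bug.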


A free software library written to perform vectorized scientific computing and to be as compatible as possible with both GNU Octave and Matlab computing frameworks, offering general purpose, portable and freely available features for the scientific community. Mastrave is mostly oriented to ease complex modelling tasks such as those typically needed within environmental models, even when involving irregular and heterogeneous data series.


Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell, web application servers, and six graphical user interface toolkits.


A library of high-level functions that facilitate making informative and attractive plots of statistical data using matplotlib. It also provides concise control over the aesthetics of the plots, improving on matplotlib’s default look.


An on-the-fly calculator for Monte Carlo methods that uses latin-hypercube sampling (see soerp for the Python implementation of the analytical second-order error propagation original Fortran code SOERP by N. D. Cox) to perform non-order specific error propagation (or uncertainty analysis). The mcerp package allows you to easily and transparently track the effects of uncertainty through mathematical calculations. Advanced mathematical functions, similar to those in the standard math module can also be evaluated directly.
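The underlying idea, pushing Latin-hypercube samples of uncertain inputs through a calculation, can be sketched with the standard library alone. This is not mcerp's API: `latin_hypercube_normal`, the two normally distributed inputs, and the sample count are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def latin_hypercube_normal(mu, sigma, n, rng):
    """Draw n samples from N(mu, sigma), one per equal-probability stratum."""
    dist = NormalDist(mu, sigma)
    # One uniform draw inside each of the n strata of the unit interval;
    # the small margins keep every probability strictly inside (0, 1).
    probs = [(i + rng.uniform(1e-12, 1.0 - 1e-12)) / n for i in range(n)]
    # Shuffle so that strata of different input variables are paired randomly.
    rng.shuffle(probs)
    return [dist.inv_cdf(p) for p in probs]

rng = random.Random(42)
n = 10_000
length = latin_hypercube_normal(10.0, 0.1, n, rng)   # hypothetical input
width = latin_hypercube_normal(5.0, 0.2, n, rng)     # hypothetical input

# Propagate the uncertainty by applying the calculation sample by sample.
area = [l * w for l, w in zip(length, width)]
area_mean = mean(area)
area_std = stdev(area)

# First-order analytical estimate for comparison:
# sqrt((5 * 0.1)**2 + (10 * 0.2)**2) = sqrt(4.25), roughly 2.06
```

Stratifying each input guarantees that the whole range of its distribution is covered, so the estimates tend to stabilise with fewer samples than plain random sampling would need.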


The MATLAB Compiler Runtime (MCR) is a standalone set of shared libraries that enables the execution of compiled MATLAB applications or components on computers that do not have MATLAB installed. When used together, MATLAB, MATLAB Compiler, and the MCR enable you to create and distribute numerical applications or software components quickly and securely.


The Modular toolkit for Data Processing is a collection of supervised and unsupervised learning algorithms and other data processing units that can be combined into data processing sequences and more complex feed-forward network architectures. The base of available algorithms is steadily increasing and includes signal processing methods (Principal Component Analysis, Independent Component Analysis, Slow Feature Analysis), manifold learning methods ([Hessian] Locally Linear Embedding), several classifiers, probabilistic methods (Factor Analysis, RBM), data pre-processing methods, and many others.


A set of software tools for data acquisition and storage and a methodology for management of complex scientific data. MDSplus allows all data from an experiment or simulation code to be stored into a single, self-descriptive, hierarchical structure. The system was designed to enable users to easily construct complete and coherent data sets. The MDSplus programming interface contains only a few basic commands, simplifying data access even into complex structures. Using the client/server model, data at remote sites can be read or written without file transfers. MDSplus includes x-windows and java tools for viewing data or for modifying or viewing the underlying structures.


A free software media publishing platform that anyone can run. You can think of it as a decentralized alternative to Flickr, YouTube, SoundCloud, etc.



A software extension for MediaWiki that makes it into a powerful environment for collaborating on publication-quality manuscripts and software projects.

Meshing to Realistic Domains

Software for generating properly closed 2D ocean domains for both global and regional simulations.


An open source, portable, and extensible system for the processing and editing of unstructured 3D triangular meshes. The system is aimed at helping to process the typical not-so-small unstructured models arising in 3D scanning, and provides a set of tools for editing, cleaning, healing, inspecting, rendering and converting such meshes.


The Metadata Gathering, Extraction and Transformation Application is a Python application for discovering and extracting metadata from spatial raster datasets (metadata crawler) and transforming it into XML (metadata transformation). A number of generic and specialised imagery formats are supported. The format support has a plugin architecture, so more formats can easily be added.


A meteorological workstation application designed to be a complete working environment for both the operational and research meteorologist. Its capabilities include powerful data access, processing and visualisation. It features a powerful icon-based user interface for interactive work, and a scripting language for batch processing. The two are linked through the ability to automatically convert icons into their equivalent script code.



A domain specific language (DSL) devoted to the simulation of biological processes, especially those whose state space must be computed jointly with the current state of the system. MGS embeds the idea of topological collections and their transformations into the framework of a simple dynamically typed functional language. Collections are just new kinds of values and transformations are functions acting on collections and defined by a specific syntax using rules. MGS is an applicative programming language: operators acting on values combine values to give new values, they do not act by side-effect.


An ANSI C library (with C++, Python and MATLAB/OCTAVE wrappers) for Maximal Information-based Nonparametric Exploration (MIC and MINE family).


A Python package for numerical optimisation, being a large collection of standard minimisation algorithms. The name minfx is simply a shortening of the mathematical expression min f(x).


An open source toolkit for students and researchers in power systems. It is designed to make working with ED, OPF, and UC problems simple and intuitive. The goal is to foster collaboration with other researchers and to make learning easier for students.


A unikernel for constructing secure, high-performance network applications across a variety of cloud computing and mobile platforms. Code can be developed on a normal OS such as Linux or Mac OS X, and then compiled into a fully-standalone, specialised microkernel that runs under the Xen hypervisor. Since Xen powers most public cloud computing infrastructure such as Amazon EC2, this lets your servers run more cheaply and securely, and with finer control, than with a full software stack.

Mirage is based around the OCaml language, with syntax extensions and libraries which provide networking, storage and concurrency support that are easy to use during development, and map directly into operating system constructs when being compiled for production deployment. The framework is fully event-driven, with no support for preemptive threading.


Implementing and consuming Machine Learning techniques at scale are difficult tasks for ML Developers and End Users. MLbase is a platform addressing the issues of both groups, and consists of three components: MLlib, MLI, ML Optimizer.


A scalable C++ machine learning library with Python bindings.


A Python module for machine learning. It provides a wide range of state-of-the-art machine learning methods for supervised and unsupervised problems and it is aimed at finding a reasonable compromise among modularity, maintainability, reproducibility, usability and efficiency.


Modelica® is a non-proprietary, object-oriented, equation based language to conveniently model complex physical systems containing, e.g., mechanical, electrical, electronic, hydraulic, thermal, control, electric power or process-oriented subcomponents.

JModelica is an extensible Modelica-based open source platform for optimization, simulation and analysis of complex dynamic systems. The main objective of the project is to create an industrially viable open source platform for optimization of Modelica models, while offering a flexible platform serving as a virtual lab for algorithm development and research. As such, it provides a platform for technology transfer where industrially relevant problems can inspire new research and where state of the art algorithms can be propagated from academia into industrial use.


OPENMODELICA is an open-source Modelica-based modeling and simulation environment intended for industrial and academic usage.


The goal of the Matrices Over Runtime Systems at Exascale (MORSE) project is to design dense and sparse linear algebra methods that achieve the fastest possible time to an accurate solution on large-scale multicore systems with GPU accelerators, using all the processing power that future high end systems can make available. To develop software that will perform well on petascale and exascale systems with thousands of nodes and millions of cores, several daunting challenges have to be overcome, both by the numerical linear algebra and the runtime system communities. By designing a research framework for describing linear algebra algorithms at a high level of abstraction, the MORSE team will enable the strong collaboration between research groups in linear algebra and runtime systems needed to develop methods and libraries that fully benefit from the potential of future large-scale machines. Our project will take a pioneering step in the effort to bridge the immense software gap that has opened up in front of the High-Performance Computing (HPC) community.


MOdular library for raSter bAsed hydrologIcal appliCatiOn.


The Model for Prediction Across Scales (MPAS) is a collaborative project for developing atmosphere, ocean and other earth-system simulation components for use in climate, regional climate and weather studies.

The defining features of MPAS are the unstructured Voronoi meshes and C-grid discretization used as the basis of the model components. The unstructured Voronoi meshes, formally Spherical Centroidal Voronoi Tessellations (SCVTs), allow for both quasi-uniform discretization of the sphere and local refinement. The C-grid discretization, where the normal component of velocity on cell edges is prognosed, is especially well-suited for higher-resolution, mesoscale atmosphere and ocean simulations.


The MIT Multidisciplinary Simulation, Estimation, and Assimilation Systems (MSEAS) group creates, develops and utilizes new mathematical models and computational methods for ocean predictions and dynamical diagnostics, for optimization and control of autonomous ocean observation systems, and for data assimilation and data-model comparisons. Our systems are used for basic and fundamental research and for realistic simulations and predictions in varied regions of the world’s ocean.


The Manifold Toolkit provides easy mechanisms to enable arbitrary algorithms to operate on manifolds. The main application is the use of 3D rotations SO(3), as well as the construction of compound manifolds from arbitrary combinations of sub-manifolds. We also provide a refactored version of the previously released SLoM framework which implements Gauss-Newton and Levenberg-Marquardt-based sparse least-squares optimization on manifolds and a port of MTK to Matlab (MTKM).


The Mimetic Methods Toolkit is a general purpose API for computer simulation of physical phenomena based on Mimetic Discretization Methods. It allows the user to develop numerical models that satisfy physical conservation laws, while preserving even order of accuracy, up to the boundary of the considered domain. A Python wrapper is available.


A Fortran 90 Library containing different subroutines to estimate the Power Spectral Density of real time series.


A Python extension package that provides three new objects, DateTime, DateTimeDelta and RelativeDateTime, which let you store and handle date/time values in a much more natural way than by using ticks (seconds since 1.1.1970 0:00 UTC), the representation used by Python’s time module. You can add, subtract and even multiply instances, pickle and copy them and convert the results to strings, COM dates, ticks and some other more esoteric values. There are also several convenient constructors and formatters at hand to greatly simplify dealing with dates and times in real-world applications. In addition to providing an easy-to-use Python interface, the package also exports a comfortable C API for other Python extensions to build upon.
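The contrast between raw ticks and date/time objects can be sketched with the standard library's `datetime` module, which offers broadly similar arithmetic. This is an analogy only and does not use the package's own DateTime classes; the specific dates are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# A moment in time as an object rather than a bare number of seconds.
launch = datetime(2014, 3, 1, 12, 0, tzinfo=timezone.utc)
ticks = launch.timestamp()        # seconds since 1970-01-01 00:00 UTC

delta = timedelta(hours=36)
later = launch + delta            # add an interval to a moment
gap = later - launch              # subtract two moments to get an interval
doubled = delta * 2               # scale an interval

# Ticks remain available, and the conversion round-trips.
restored = datetime.fromtimestamp(ticks, timezone.utc)
```

Working with objects keeps the units and the calendar logic out of user code: the 36-hour offset crosses two day boundaries without any manual arithmetic on seconds.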


Provides an easy to use, high-performance, reliable and robust Python interface to ODBC compatible databases such as MS SQL Server and MS Access, Oracle Database, IBM DB2 and Informix, Sybase ASE and Sybase Anywhere, MySQL, PostgreSQL, SAP MaxDB and many more. ODBC stands for Open Database Connectivity and is the industry standard API for connecting applications to databases. In order to facilitate setting up ODBC connectivity, operating systems typically provide ODBC Managers which help set up the ODBC drivers and manage the binding of the applications against these drivers. On Windows and Mac OS X the ODBC Manager is built into the system. On Unix platforms, you can choose one of the ODBC managers unixODBC, iODBC or DataDirect, which provide the same ODBC functionality on most Unix systems.


The mystic framework provides a collection of optimization algorithms and tools that allows the user to more robustly (and readily) solve optimization problems. All optimization algorithms included in mystic provide workflow at the fitting layer, not just access to the algorithms as function calls. Mystic gives the user fine-grained power to both monitor and steer optimizations as the fit processes are running.

Where possible, mystic optimizers share a common interface, and thus can be easily swapped without the user having to write any new code. Mystic solvers all conform to a solver API, thus also have common method calls to configure and launch an optimization job. For more details, see mystic.abstract_solver. The API also makes it easy to bind a favorite 3rd party solver into the mystic framework.

By providing a robust interface designed to allow the user to easily configure and control solvers, mystic reduces the barrier to implementing a target fitting problem as stable code. Thus the user can focus on building their physical models, and not spend time hacking together an interface to optimization code.


Natron is free, open-source, cross-platform compositing software for producing visual effects.


ocean isosurfaces

A tool for extracting isosurfaces from oceanographic simulation output, such as from ROMS or HOPS. It also has the ability to compute depth-adjusted means and standard deviations, so that statistical isosurfaces (such as temperature relative to the depth-adjusted mean) may be generated.


A program for exploring longitude/latitude based data stored in NetCDF file format. Ncvtk is built on top of the VTK toolbox. Ncvtk has been designed with the aim of offering a high degree of interactivity to scientists who have a need to explore three-dimensional, time-dependent planetary data. The input data should be stored in a NetCDF file and the metadata should loosely follow the CDC convention. In particular, we support codes that are part of the Flexible Modeling System infrastructure provided the data lie on a longitude/latitude, structured grid.


A WMS (Web Map Service) for geospatial data stored in CF-compliant NetCDF files. ncWMS relies heavily on the Java NetCDF interface from Unidata. This library does a lot of the work of metadata and data extraction. In particular the GridDatatype class is frequently used to provide a high-level interface to gridded geospatial NetCDF files. The library will also read from NetCDF files on HTTP servers and from OPeNDAP servers. ncWMS has now been integrated with the THREDDS Data Server.


Full implementations of 1D, 2D and 3D hydrodynamics and magnetohydrodynamics.


A comprehensive community model that predicts waves, currents, sediment transport and bathymetric change in the nearshore ocean, between the shoreline and about 10 m water depth. The model consists of a "backbone", i.e., the master program, handling data input and output as well as internal storage, together with a suite of "modules", each of which handles a focused subset of the physical processes being studied. A wave module will model wave transformation over arbitrary coastal bathymetry and predict radiation stresses and wave induced mass fluxes. A circulation module will model the slowly varying current field driven by waves, wind and buoyancy forcing, and will provide information about the bottom boundary layer structure. A seabed module will model sediment transport, determine the bedform geometry, parameterize the bedform effect on bottom friction, and compute morphological evolution resulting from spatial variations in local sediment transport rates.


A computational fluid dynamics solver based on the spectral element method.


A state-of-the-art modeling framework for oceanographic research, operational oceanography, seasonal forecasting and climate studies.


A Python package which implements various diagnostics for NEMO model output.


A robust (fully ACID) transactional property graph database. Due to its graph data model, Neo4j is highly agile and blazing fast. Its developers claim that, for connected data operations, Neo4j runs a thousand times faster than relational databases.



NetCDF extension for finite element grids.


A library providing high-performance I/O while still maintaining file-format compatibility with Unidata’s NetCDF.


Nonlinear multivariate and time series analysis by neural network methods.


The NFFT (nonequispaced fast Fourier transform or nonuniform fast Fourier transform) is a C subroutine library for computing the nonequispaced discrete Fourier transform (NDFT) and its generalisations in one or more dimensions, of arbitrary input size, and of complex data.
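For reference, the transform that the library accelerates can be written directly from its definition. The sketch below evaluates the nonequispaced discrete Fourier transform in O(N·M) time using only the standard library; the function name `ndft` is hypothetical, and the frequency ordering (from -N/2 up to N/2 - 1) and sign convention are assumed to match the usual NFFT formulation.

```python
import cmath

def ndft(f_hat, nodes):
    """Directly evaluate f_j = sum_k f_hat[k] * exp(-2*pi*i*k*x_j).

    f_hat holds N Fourier coefficients for frequencies -N/2 .. N/2 - 1,
    and nodes holds M arbitrary (not necessarily equispaced) points x_j.
    """
    n = len(f_hat)
    freqs = range(-(n // 2), n - n // 2)
    return [
        sum(c * cmath.exp(-2j * cmath.pi * k * x) for c, k in zip(f_hat, freqs))
        for x in nodes
    ]

# Four coefficients (frequencies -2, -1, 0, 1) evaluated at three uneven nodes.
values = ndft([0.0, 0.0, 1.0, 0.5], [-0.31, 0.02, 0.25])
```

A fast algorithm produces the same values to a chosen accuracy at a much lower cost than this double loop, which is why the direct form is mainly useful for testing.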


A parallel FFT software library based on MPI.


A parallel software library for the calculation of three-dimensional nonequispaced FFTs. It is available under the GPL licence. The parallelization is based on MPI. PNFFT depends on the PFFT and FFTW software libraries.


A Python interface for NFFT.


Numerical Information Field Theory is a versatile library designed to enable the development of signal inference algorithms that operate regardless of the underlying spatial grid and its resolution. Its object-oriented framework is written in Python, although it accesses libraries written in Cython, C++, and C for efficiency.

NIFTY offers a toolkit that abstracts discretized representations of continuous spaces, fields in these spaces, and operators acting on fields into classes. Thereby, the correct normalization of operations on fields is taken care of automatically without concerning the user. This allows for an abstract formulation and programming of inference algorithms, including those derived within information field theory. Thus, NIFTY permits its user to rapidly prototype algorithms in 1D, and then apply the developed code in higher-dimensional settings of real world problems. The set of spaces on which NIFTY operates comprises point sets, n-dimensional regular grids, spherical spaces, their harmonic counterparts, and product spaces constructed as combinations of those.


Cloud computing for science.



Nonlinear principal component analysis (NLPCA) is commonly seen as a nonlinear generalization of standard principal component analysis (PCA). It generalizes the principal components from straight lines to curves (nonlinear). Thus, the subspace in the original data space which is described by all nonlinear components is also curved. Nonlinear PCA can be achieved by using a neural network with an autoassociative architecture, also known as an autoencoder, replicator network, bottleneck or sandglass type network. Such an autoassociative neural network is a multi-layer perceptron that performs an identity mapping, meaning that the output of the network is required to be identical to the input. However, in the middle of the network is a layer that works as a bottleneck in which a reduction of the dimension of the data is enforced. This bottleneck layer provides the desired component values (scores).



NOVAS is an integrated package of subroutines and functions for computing various commonly needed quantities in positional astronomy. The package can provide, in one or two subroutine or function calls, the instantaneous coordinates of any star or planet in a variety of coordinate systems. At a lower level, NOVAS also supplies astrometric utility transformations, such as those for precession, nutation, aberration, parallax, and the gravitational deflection of light. The computations are accurate to better than one milliarcsecond. The NOVAS package is an easy-to-use facility that can be incorporated into data reduction programs, telescope control systems, and simulations. The U.S. parts of The Astronomical Almanac are prepared using NOVAS. Three editions of NOVAS are available: Fortran, C, and Python.

The algorithms used by NOVAS 3.1 are based on a vector and matrix formulation that is rigorous and does not use spherical trigonometry at any point. Objects inside and outside the solar system are treated similarly. The position vectors formed and operated on by NOVAS place each object at its relevant distance (in AU) from the solar system barycenter.
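The vector and matrix approach can be illustrated with a small standard-library sketch. None of the functions below are NOVAS routines: the names are hypothetical, and the single rotation about the z axis merely stands in for the more involved precession and nutation matrices that a real reduction would use.

```python
import math

def radec_to_vector(ra_hours, dec_deg, dist_au):
    """Convert right ascension, declination and distance to a Cartesian vector."""
    ra = math.radians(ra_hours * 15.0)  # 1 hour of right ascension = 15 degrees
    dec = math.radians(dec_deg)
    return (dist_au * math.cos(dec) * math.cos(ra),
            dist_au * math.cos(dec) * math.sin(ra),
            dist_au * math.sin(dec))

def vector_to_radec(vec):
    """Convert a Cartesian vector back to (RA in hours, Dec in degrees, distance)."""
    x, y, z = vec
    dist = math.sqrt(x * x + y * y + z * z)
    ra_hours = (math.degrees(math.atan2(y, x)) / 15.0) % 24.0
    dec_deg = math.degrees(math.asin(z / dist))
    return ra_hours, dec_deg, dist

def frame_rotation_z(angle_rad):
    """Matrix that rotates the coordinate axes about z by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))

def mat_vec(matrix, vec):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)

# An object at RA 9h, Dec 0 deg, 2 AU, expressed in a frame rotated by 45 degrees.
position = radec_to_vector(9.0, 0.0, 2.0)
rotated = mat_vec(frame_rotation_z(math.radians(45.0)), position)
ra_hours, dec_deg, dist_au = vector_to_radec(rotated)
```

Angles appear only when converting to and from vectors at the boundaries; every change of reference system in between is a matrix product, so successive transformations compose by multiplication and distance is carried along unchanged.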

Released in late 2009, NOVAS 3.0 provided greater accuracy of star and planet position calculations (apparent places) by including several small effects not implemented in the NOVAS 2.0 code of 1998. NOVAS 3.0 also fully implemented recent resolutions by the International Astronomical Union (IAU) on positional astronomy, including new reference system definitions and updated models for precession and nutation. The paper by Kaplan et al. (1989, Astron. J. 97, 1197) describes the overall computational strategy used by NOVAS, although many of the individual algorithms described there have been improved. USNO Circular 179 describes the IAU recommendations that underpin much of NOVAS 3.0 and is the basic reference for NOVAS algorithms relating to time, Earth orientation, and the transformations between various astronomical reference systems. The current version, NOVAS 3.1, provides some new capabilities and fixes some bugs.


The non-parametric entropy estimation toolbox includes estimators for entropy, mutual information, and conditional mutual information for both continuous and discrete variables. Additionally, it includes a KL divergence estimator for continuous distributions and a mutual information estimator between continuous and discrete variables.
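As a reminder of the quantities being estimated, the discrete case can be sketched with simple plug-in (frequency-count) estimators. This is not the toolbox's code, and estimators for continuous variables are more involved; the function names here are hypothetical.

```python
import math
from collections import Counter

def entropy(xs, base=2):
    """Plug-in entropy estimate H(X) from a list of discrete observations."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n, base) for c in Counter(xs).values())

def mutual_information(xs, ys, base=2):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    joint = list(zip(xs, ys))
    return entropy(xs, base) + entropy(ys, base) - entropy(joint, base)

coin = [0, 1, 0, 1]
h_coin = entropy(coin)                                        # fair coin
mi_self = mutual_information(coin, coin)                      # X fully determines X
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])     # independent pattern
```

Plug-in estimates are biased for small samples, which is one reason dedicated estimators are worth having.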



A package designed to address the problem of non-parametric statistical modeling of probability densities and regression surfaces. This type of modeling becomes very useful when there is little prior information available to justify an assumption that the data belongs to a certain parametric family of distributions or curves. The main intended application is fast and detailed modeling of the response and transfer functions of particle detectors, but the package is sufficiently general and can be used for solving a variety of statistical analysis problems from other areas. Both univariate and multivariate models are supported, and a number of original algorithms are implemented. The capabilities include:

  • Calculation of descriptive sample statistics

  • Arbitrary-dimensional histogramming

  • Parametric, semi-parametric and non-parametric density modeling

  • Generation of pseudo- and quasi-random numbers according to various density models

  • Non-parametric density interpolation (morphing), including multivariate densities

  • Non-parametric copula modeling and copula-based density interpolation

  • Fast kernel density estimation (KDE) via DFFT

  • Density estimation by local orthogonal polynomial expansion (LOrPE)

  • Density estimation by the nearest neighbors method

  • Local regression techniques: local polynomial least squares, iterative local least trimmed squares, local logistic regression, local quantile regression with and without censoring

  • Expectation-maximization unfolding with smoothing
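Kernel density estimation, one of the capabilities listed above, can be sketched in a few lines. The version below is a direct Gaussian-kernel evaluation in plain Python, not the FFT-based method the package advertises and not its API; `gaussian_kde` and the sample values are assumptions for illustration.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of samples with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, each centred on that sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

density = gaussian_kde([0.0, 1.0, 2.0], bandwidth=0.5)

# Numerical check that the estimate integrates to roughly one.
step = 0.01
total = sum(density(-5.0 + i * step) for i in range(1201)) * step
```

Direct evaluation costs one kernel evaluation per sample per query point. Binning the data and convolving with the kernel through an FFT reduces this cost when the density is needed on a regular grid.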


A general purpose library for multiphysics simulations based on finite elements.


A micromagnetic simulation package.


A data mining benchmark suite containing a mix of several representative data mining applications from different application domains. This benchmark is intended for use in computer architecture research, systems research, performance evaluation, and high-performance computing. The well-known applications assembled in this benchmark suite have been collected from research groups in industry and academia. The applications contain highly optimized versions of the data mining algorithms. Scalable versions of the applications are also provided. Such extensions were designed and implemented by developers at Northwestern University. Currently, the benchmark has applications with algorithms based on clustering, association rules, classification, Bayesian networks, pattern recognition, support vector machines and several other well-known data mining methodologies. These applications are used in diverse fields like bioinformatics, network intrusion, customer relationship management, and marketing.


The Numenta Platform for Intelligent Computing comprises a set of learning algorithms that were first described in a white paper published by Numenta in 2009. The learning algorithms faithfully capture how layers of neurons in the neocortex learn.


Software allowing synchronized exchanges of coupling information between numerical codes representing different components of the climate system. OASIS3-MCT, the new version of the OASIS coupler interfaced with the Model Coupling Toolkit (MCT) from the Argonne National Laboratory, offers a fully parallel implementation of coupling field regridding and exchange. Low intrusiveness, portability and flexibility are the key design concepts of OASIS3-MCT. OASIS3-MCT supports coupling of general two-dimensional fields. Unstructured grids and 3D grids are also supported using a one-dimensional representation of the two- or three-dimensional structures. Thanks to MCT, all transformations, including regridding, are executed in parallel on the set of source or target component processes and all coupling exchanges are now executed in parallel directly between the components via Message Passing Interface (MPI). OASIS3-MCT also supports file I/O using NetCDF, allowing an easy switch between the coupled and forced modes. In the current version, however, this functionality is not parallel: the fields are read and written by the master process only.


KML currently handles spatial and temporal tags, but not data content. The below is an attempt to devise a content schema and mapping which would allow not only the display of data content, but also some meaningful data sharing within the observations community using KML/KMZ as a data transport mechanism.


A package in the R statistical language that helps oceanographers do their work.


Ocean C-grid model setup and analysis tools, for the numerical mariner.


Provides interactive exploration, analysis and visualization of oceanographic and other geo-referenced profile or sequence data. It is available for all major computer platforms and currently has more than 20,000 registered users. ODV has a very rich set of interactive capabilities and supports a very wide range of plot types. This makes ODV ideal for visual and automated quality control.


The OpenFabrics Enterprise Distribution (OFED™) is open-source software for RDMA and kernel bypass applications. OFED includes kernel-level drivers, channel-oriented RDMA and send/receive operations, kernel bypasses of the operating system, both kernel and user-level application programming interfaces (APIs) and services for parallel message passing (MPI), sockets data exchange (e.g., RDS, SDP), NAS and SAN storage (e.g., iSER, NFS-RDMA, SRP) and file system/database systems. The network and fabric technologies that provide RDMA performance with OFED include: legacy 10 Gigabit Ethernet, iWARP for Ethernet, RDMA over Converged Ethernet (RoCE), and 10/20/40 Gigabit InfiniBand.


A self-contained C++ class library for the automatic layout of diagrams. OGDF offers sophisticated algorithms and data structures to use within your own applications or scientific projects.


A command line utility that converts a graph layout stored as a GML file into a graphics file. Supported graphics formats are the bitmap formats PNG, JPEG, TIFF and the vector graphics formats SVG, PDF, EPS. It is based on OGDF and uses Qt 4 for high-quality graphics rendering.


An optimizing compiler for the Itanium and x86-64 microprocessor architectures. It derives from the SGI compilers for the MIPS R10000 processor, called MIPSPro. It was initially released in 2000 as GNU GPL software under the name Pro64. The following year, the University of Delaware adopted the project and renamed the compiler Open64. It now mostly serves as a research platform for compiler and computer architecture research groups. Open64 supports Fortran 77/95 and C/C++, as well as the shared memory programming model OpenMP. It can conduct high-quality interprocedural analysis, data-flow analysis, data dependence analysis, and array region analysis.


A tool for automatic differentiation of numerical computer programs.
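For readers unfamiliar with automatic differentiation, the idea can be sketched with dual numbers in forward mode. This only illustrates the general technique and says nothing about how this particular tool is implemented; the `Dual` class and `derivative` helper are hypothetical.

```python
import math

class Dual:
    """A value paired with its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def dsin(x):
    """Sine with the chain rule applied to the derivative part."""
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

def derivative(func, x):
    """Evaluate func'(x) by seeding the input derivative with 1."""
    return func(Dual(x, 1.0)).deriv

slope = derivative(lambda x: x * x + 3 * x, 2.0)   # d/dx (x^2 + 3x) at x = 2
```

Unlike finite differences, the derivative is exact up to floating-point rounding, because each elementary operation carries its own differentiation rule.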


An optimized BLAS library based on GotoBLAS2 1.13 BSD version.


OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. The library has more than 2500 optimized algorithms, which include a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high resolution image of an entire scene, find similar images from an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc.


An open interface standard for (and free implementation of) a set of tools to quickly implement data-assimilation and calibration for arbitrary numerical models. OpenDA aims to stimulate the use of data-assimilation and calibration by lowering the implementation costs and enhancing the exchange of software among researchers and end-users.


OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics. It includes tools for meshing, notably snappyHexMesh, a parallelised mesher for complex CAD geometries, and for pre- and post-processing. Almost everything (including meshing, and pre- and post-processing) runs in parallel as standard, enabling users to take full advantage of computer hardware at their disposal.


An open source C++ toolkit designed to assist the creative process by providing a simple and intuitive framework for experimentation. The toolkit is designed to work as a general purpose glue, and wraps together several commonly used libraries.


A C++ template library for discrete factor graph models and distributive operations on these models. It includes state-of-the-art optimization an