List of Awesome Lists -

52 North Github -

Clever Algorithms: Nature-Inspired Programming Recipes -

Extra Packages for Enterprise Linux -

Experimental Mathematics -

Frontier Nerds -

Hacking for Artists -

IBM Developerworks Open Source -

Libre Graphics World -

Linux Virtualization Wiki -

NASA History Series Publications -

National Science Library -

Nature of Code (The) -

New General Catalog of Old Books and Authors -

Practical Common Lisp -

Pythonic Perambulations -

txt2re Regular Expression Generator -

Virt Tools Blog Planet -


Fedora People Repositories -

Fedora Third Party Repos -

RepoForge -

RPM Fusion -

GPGPU, CUDA and OpenCL -

GPU Computing at BOINC (2015-03-09) -

Code Optimization and Performance Analysis of Oceanographic Software Package NEMO for GPGPU Systems (2014) -

GPU Computing: Which Card Should You Get? (2013-12-25) -

CUDA Tutorial from Colorado School of Mines (2010?) -

Nvidia’s CUDA: The End of the CPU? (2008-06-18) -



BeagleBone Black -

littleBits -

LowPowerLab -

New Section



3DEX is a Fortran/CXX package providing programs and functions to perform fast Fourier-Bessel decomposition of 3D fields. It can be applied to cosmological data or 3D data in spherical coordinates in other scientific fields. We present an equivalent formulation of the spherical Fourier-Bessel decomposition that separates radial and tangential calculations. We propose the use of the existing pixelisation scheme HEALPix for a rapid calculation of the tangential modes. 3DEX (3D EXpansions) is a public code for fast spherical Fourier-Bessel decomposition of 3D all-sky surveys that takes advantage of HEALPix for the calculation of tangential modes. 3DEX can also be used in other disciplines, where 3D data are to be analysed in spherical coordinates.


An approximate compiler for C and C++ programs based on Clang. Think of it as your assistant in breaking your program in small ways to trade off correctness for performance.


AMD Core Math Library, or ACML, provides a free set of thoroughly optimized and threaded math routines for HPC, scientific, engineering and related compute-intensive applications. ACML is ideal for weather modeling, computational fluid dynamics, financial analysis, oil and gas applications and more. ACML consists of the following main components:

  • A full implementation of Level 1, 2 and 3 Basic Linear Algebra Subroutines (BLAS), with key routines optimized for high performance on AMD Opteron™ processors. The BLAS level 3 routines will take advantage of heterogeneous computing through OpenCL if detected.

  • A full suite of Linear Algebra (LAPACK) routines. As well as taking advantage of the highly-tuned BLAS kernels, a key set of LAPACK routines has been further optimized to achieve considerably higher performance than standard LAPACK implementations.

  • Beginning with version 6 of ACML, a subset of the FFTW interfaces is supported for Fourier transform functionality. Heterogeneous compute with GPU/APU and OpenCL is supported through the FFTW interfaces. A comprehensive set of FFTs through the ACML-specific API (found in version 5 and older) continues to be available in version 6.

  • Random Number Generators in both single- and double-precision.

Active Papers

ActivePapers is a research and development project whose aim is to make computational science more open and more reliable by making computations reproducible and publishable. It is also a file format for storing computations.

An ActivePaper is a file combining datasets and programs working on these datasets in a single package, which also contains a detailed history of which data was produced when, by running which code, and on which machine. It is a complete record of the state of a computational research project that can be shared among collaborators and in the end published as supplementary material to a journal article.


An approximate MAP decoder with Alternating Direction Dual Decomposition. AD3 (Alternating Directions Dual Decomposition) is an LP-MAP decoder for undirected constrained factor graphs. In other words, it is an approximate MAP decoder that retrieves the solution of an LP relaxation of the original problem.

The input is a factor graph, which may contain both soft factors, associated with log-potentials, and hard constraint factors, associated with a logic function. Factors can be dense, sparse, or combinatorial. Specialized factors can be implemented by the practitioner.

The output is the LP-MAP assignment, with a posterior value for each variable. If all variables are integer, the relaxation is tight and the solution is the true MAP. Otherwise, some entries can be in the unit interval. External tools can be used to obtain a valid solution using rounding heuristics. Optionally, a flag can be set that applies a branch-and-bound procedure and retrieves the true MAP (but it can be slow if the relaxation has many fractional components).


Adaptive Hydraulics (AdH) is a modern, multi-dimensional modeling system for saturated and unsaturated groundwater, overland flow, three-dimensional Navier-Stokes flow, and two- or three-dimensional shallow water problems. Developed by the Coastal and Hydraulics Laboratory at the Engineer Research and Development Center in Vicksburg, MS, the 2-dimensional (2D) shallow water module of AdH was released to the public in September 2007.


This site outlines the Aquatic Ecodynamics (AED) modelling library - an open-source community-driven library of model components for simulation of "aquatic ecodynamics" - water quality, habitat and aquatic ecosystem dynamics.

The AED library consists of numerous modules that are designed as individual model ‘components’ able to be configured in a way that facilitates custom aquatic ecosystem conceptualisations – either simple or complex. Users select the water quality and ecosystem variables they wish to simulate and are then able to customize connections and dependencies with other modules, including support for easy customisation, at an algorithm level, of how model components operate (e.g. photosynthesis functions, sorption algorithms etc). In general, model components consider the cycling of carbon, nitrogen and phosphorus, and other relevant components such as oxygen, and are able to simulate organisms including different functional groups of phytoplankton and zooplankton, and also organic matter. Modules to support simulation of water column and sediment geochemistry, including coupled kinetic-equilibria, are also included.


The Stochastic Simulation Algorithm (SSA) developed by Gillespie provides a powerful mechanism for exploring the behavior of chemical systems with small species populations or with important noise contributions. Gene circuit simulations for systems biology commonly employ the SSA method, as do ecological applications. This algorithm tends to be computationally expensive, so researchers seek an efficient implementation of SSA. In this program package, the Accelerated Exact Stochastic Simulation Algorithm (AESS) contains optimized implementations of Gillespie’s SSA that improve the performance of individual simulation runs or ensembles of simulations used for sweeping parameters or to provide statistically significant results.
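
To give a feel for what such a package optimizes, Gillespie's direct method can be sketched in a few lines of Python. This is a minimal, unoptimized sketch and is not AESS code: the function name `gillespie_direct`, the reaction encoding, and the single-species decay example are all assumptions made for illustration.

```python
import math
import random


def gillespie_direct(state, reactions, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    state     -- dict mapping species name to integer population
    reactions -- list of (propensity_fn, change_dict) pairs (hypothetical encoding)
    t_end     -- stop once simulated time would pass this value
    rng       -- random.Random instance, so runs can be seeded
    """
    t = 0.0
    state = dict(state)
    trajectory = [(t, dict(state))]
    while True:
        # Propensity of each reaction in the current state.
        props = [fn(state) for fn, _ in reactions]
        total = sum(props)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for prop, (_, change) in zip(props, reactions):
            cumulative += prop
            if threshold < cumulative:
                break
        # Apply the population changes of the chosen reaction.
        for species, delta in change.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical example: first-order decay A -> (nothing) with rate 0.5 per molecule.
decay = [(lambda s: 0.5 * s["A"], {"A": -1})]
traj = gillespie_direct({"A": 20}, decay, t_end=1000.0, rng=random.Random(1))
```

Each step recomputes every propensity and scans the list linearly, which is exactly the kind of cost that optimized implementations try to reduce.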


Akaros is an open source, GPL-licensed operating system for manycore architectures. Our goal is to provide support for parallel and high-performance applications and to scale to a large number of cores.


Albany is an implicit, unstructured grid, finite element code for the solution and analysis of partial differential equations. Albany is the main demonstration application of the AgileComponents software development strategy at Sandia. It is a PDE code that strives to be built almost entirely from functionality contained within reusable libraries (such as Trilinos/STK/Dakota/PUMI). Albany plays a large role in demonstrating and maturing functionality of new libraries, and also in the interfaces and interoperability between these libraries. It also serves to expose gaps in our coverage of capabilities and interface design.

The highlight of Albany is the PDE assembly. The template-based generic programming approach allows developers to just program for residual equations, and all manner of derivatives and polynomial propagations get automatically computed with no development effort. This approach uses Phalanx for rapid and flexible addition of physics, which works closely with Sacado and Stokhos for automatic propagation of derivatives and UQ. The Trilinos Intrepid and Shards packages are used for the local discretization. A second strength of Albany is the demonstration of transformational analysis algorithms. Albany demonstrates the direct use of all Solver/Analysis tools in Trilinos (through Piro, which was developed in Albany) including NOX, LOCA, Rythmos, Stokhos, and all of Dakota. On any problem we not only get a solution, but can also get sensitivities, run optimization problems, and perform uncertainty quantification. All of these approaches can access all of the linear solver options in Trilinos that are exposed by the Stratimikos layer. The third main strength is the early adoption of STK, the sierra toolkit libraries. This includes the mesh database, IO, and mesh adaptation capabilities from stk_rebalance and stk_adapt.


The Arcade Learning Environment (ALE) is a simple object-oriented framework that allows researchers and hobbyists to develop AI agents for Atari 2600 games. It is built on top of the Atari 2600 emulator Stella and separates the details of emulation from agent design.


Amahi is software that runs on a dedicated PC as a central computer for your home. It handles your entertainment, storage, and computing needs. You can store, organize and deliver your recorded TV shows, videos and music to media devices in your network. Share them locally or safely around the world. And it’s expandable with a multitude of one-click install apps.


AMD LibM is a software library containing a collection of basic math functions optimized for x86-64 processor based machines. It provides many routines from the list of standard C99 math functions. AMD LibM is a C library, which users can link in to their applications to replace compiler-provided math functions. Generally, programmers access basic math functions through their compiler. But those who want better accuracy or performance than their compiler’s math functions can use this library to help improve their applications. Users can also take advantage of the vector functions in this library. The vector variants can be used to speed up loops and perform math operations on multiple elements conveniently.


AMG2013 is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. It has been derived directly from the BoomerAMG solver in the hypre library, a large linear solver library that is being developed in the Center for Applied Scientific Computing (CASC) at LLNL. The driver provided in the benchmark can build various test problems. The default problem is a Laplace type problem on an unstructured domain with various jumps and an anisotropy in one part.

AMG2013 is written in ISO-C. It is an SPMD code which uses MPI as well as OpenMP. Parallelism is achieved by data decomposition. The driver provided with AMG2013 achieves this decomposition by simply subdividing the grid into logical P x Q x R (in 3D) chunks of equal size. The benchmark was designed to test parallel weak scaling efficiency.

AMG2013 is a highly synchronous code. The communications and computations patterns exhibit the surface-to-volume relationship common to many parallel scientific codes. Hence, parallel efficiency is largely determined by the size of the data "chunks" mentioned above, and the speed of communications and computations on the machine. AMG2013 is also memory-access bound, doing only about 1-2 computations per memory access, so memory-access speeds will also have a large impact on performance.
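
The surface-to-volume relationship mentioned above can be made concrete with a little arithmetic. The sketch below assumes a cubic chunk of n x n x n grid points per process, a simple model in which communication scales with the number of points on the six chunk faces, and an x-fastest rank ordering. The function names are hypothetical, and nothing here is taken from the AMG2013 driver.

```python
def rank_to_chunk(rank, p, q, r):
    """Map a rank to (i, j, k) coordinates in a P x Q x R process grid.

    Assumes ranks are numbered with the first coordinate varying fastest.
    """
    if not 0 <= rank < p * q * r:
        raise ValueError("rank outside the process grid")
    i = rank % p
    j = (rank // p) % q
    k = rank // (p * q)
    return i, j, k


def surface_to_volume(n):
    """Ratio of face points to total points for an n x n x n chunk."""
    surface = 6 * n * n  # points on the six faces (edges counted more than once)
    volume = n ** 3      # all points owned by the process
    return surface / volume


# Doubling the chunk edge halves the relative communication cost in this model.
ratios = [surface_to_volume(n) for n in (10, 20, 40)]
```

In this model the ratio is 6/n, so larger chunks per process mean proportionally less communication per unit of computation.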


AMGCL is a C++ header-only library for constructing an algebraic multigrid (AMG) hierarchy. AMG is one of the most effective methods for the solution of large sparse unstructured systems of equations, arising, for example, from discretization of PDEs on unstructured grids [5,6]. The method can be used as a black-box solver for various computational problems, since it does not require any information about the underlying geometry. AMG is often used not as a standalone solver but as a preconditioner within an iterative solver (e.g. Conjugate Gradients, BiCGStab, or GMRES).

AMGCL builds the AMG hierarchy on a CPU and then transfers it to one of the provided backends. This allows for transparent acceleration of the solution phase with help of OpenCL, CUDA, or OpenMP technologies. Users may provide their own backends which enables tight integration between AMGCL and the user code.
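
The idea of plugging a preconditioner into an iterative solver can be sketched as follows. This is not AMGCL's API: it is a plain-Python preconditioned Conjugate Gradient loop, and a simple Jacobi (diagonal) preconditioner stands in for the AMG hierarchy to keep the example short. All names are hypothetical.

```python
def matvec(a, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def preconditioned_cg(a, b, apply_preconditioner, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    apply_preconditioner(r) should return an approximation of A^-1 r;
    an AMG cycle would be used here in place of the Jacobi stand-in below.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    z = apply_preconditioner(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test problem: 1D Poisson matrix tridiag(-1, 2, -1), right-hand side of ones.
n = 8
a = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n


def jacobi(r):
    """Diagonal preconditioner: divide each residual entry by the matrix diagonal."""
    return [ri / a[i][i] for i, ri in enumerate(r)]


x, iterations = preconditioned_cg(a, b, jacobi)
```

The solver only ever calls `apply_preconditioner`, which is why a multigrid hierarchy built elsewhere can be swapped in without touching the Krylov loop.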

Anaconda Accelerate

Accelerate is an add-on to Continuum’s free enterprise Python distribution, Anaconda. It opens up the full capabilities of your GPU or multi-core processor to Python. Accelerate includes two packages that can be added to your Python installation: NumbaPro and MKL Optimizations. MKL Optimizations makes linear algebra, random number generation, Fourier transforms, and many other operations run faster and in parallel. NumbaPro builds fast GPU and multi-core machine code from easy-to-read Python and NumPy code with a Python-to-GPU compiler.

If you are an academic at a degree-granting institution, all of these add-ons are free of charge. Simply click Anaconda Academic License and fill out the form. If your email address ends in .edu or is in our list of approved academic institutions, the license will be automatically sent to the provided email.

Accelerated Computing with Python

Python is one of the fastest growing and most popular programming languages available. However, as an interpreted language, it has been considered too slow for high-performance computing. That has now changed with the release of the NumbaPro Python compiler from Continuum Analytics.

CUDA Python – Using the NumbaPro Python compiler, which is part of the Anaconda Accelerate package from Continuum Analytics, you get the best of both worlds: rapid iterative development and all other benefits of Python combined with the speed of a compiled language targeting both CPUs and NVIDIA GPUs.


Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.

Not only can it be used for automated configuration management, but it also excels at orchestration, provisioning of systems, zero-time rolling updates and application deployment. Ansible can be used to keep all your systems configured exactly the way you want them, and if you have many identical systems, Ansible will ensure they stay identical. For Linux system administrators, Ansible is an indispensable tool in implementing and maintaining a strong security posture.

Ansible can be used to deploy and configure multiple Unix-like servers (Red Hat, Debian, CentOS, OS X, any of the BSDs and others) using secure shell (SSH) instead of the more common client-server methodologies used by other configuration management packages.


Ansible playbooks for deploying OpenStack.


A set of Ansible playbooks to build and maintain your own private cloud: email, calendar, contacts, file sync, IRC bouncer, VPN, and more.


Antik provides a foundation for scientific and engineering computation in Common Lisp. It is designed not only to facilitate numerical computations, but to permit the use of numerical computation libraries and the interchange of data and procedures, whether foreign (non-lisp) or Lisp libraries. It is named after the Antikythera mechanism, one of the oldest examples of a scientific computer known.


ANUGA is a Free & Open Source Software (FOSS) package capable of modelling the impact of hydrological disasters such as dam breaks, riverine flooding, storm-surge or tsunamis.

ANUGA is based on the Shallow Water Wave Equation discretised to unstructured triangular meshes using a finite-volumes numerical scheme. A major capability of ANUGA is that it can model the process of wetting and drying as water enters and leaves an area. This means that it is suitable for simulating water flow onto a beach or dry land and around structures such as buildings. ANUGA is also capable of modelling difficult flows involving shock waves and rapidly changing flow speed regimes (transitions from sub critical to super critical flows).


AnyDSL is a framework for the rapid development of domain-specific languages (DSLs). AnyDSL’s main ingredient is its intermediate representation, Thorin. In contrast to other intermediate representations, Thorin features certain abstractions that make it possible to maintain domain-specific types and control flow.

As creating a front-end for some language is a complex and time-consuming endeavor, we offer Impala, an imperative language built on well-known imperative constructs. A DSL developer can hijack Impala such that desired domain-specific types and constructs are available in Impala simply by declaring them. The DSL developer just reuses Impala’s infrastructure (lexer, parser, semantic analysis, and code generator) and does not need to develop a separate front-end. Even more important: the decision how to implement domain-specific details is postponed to the expert of the target machine.


AMD OpenCL™ Accelerated Parallel Processing (APP) technology is a set of advanced hardware and software technologies that enable AMD graphics processing cores (GPU), working in concert with the system’s x86 cores (CPU), to execute heterogeneously to accelerate many applications beyond just graphics. This enables better balanced platforms capable of running demanding computing tasks faster than ever, and sets software developers on the path to optimize for AMD Accelerated Processing Units (APUs). The AMD APP Software Development Kit (SDK) is a complete development platform created by AMD to allow you to quickly and easily develop applications accelerated by AMD APP technology. The SDK provides samples, documentation, and other materials to quickly get you started leveraging accelerated compute using OpenCL™, Bolt, or C++ AMP in your C/C++ application, or Aparapi for your Java application.


This package supports a flexible, arbitrarily high level of numeric precision: the equivalent of hundreds or even thousands of decimal digits (up to approximately ten million digits if needed). Special routines are provided for extra-high precision (above 1000 digits). The entire library is written in C++. High-precision real, integer and complex datatypes are supported. Both C++ and Fortran-90 translation modules are also provided that permit one to convert an existing C++ or Fortran-90 program to use the library with only minor changes to the source code. In most cases only the type statements and (in the case of Fortran-90 programs) read/write statements need be changed. Six implementations of PSLQ (one-, two- and three-level, regular and multi-pair) are included, as well as three high-precision quadrature programs. New users are encouraged to use this package, rather than MPFUN90 or MPFUN77 (see below).

This version of the ARPREC package now includes "The Experimental Mathematician’s Toolkit", which is available as the program "mathtool" in the subdirectory "toolkit". This is a complete interactive high-precision arithmetic computing environment. One enters expressions in a Mathematica-style syntax, and the operations are performed using the ARPREC package, with a level of precision that can be set from 100 to 1000 decimal digits. Variables and vector arrays can be defined and referenced. This program supports all basic arithmetic operations, common transcendental and combinatorial functions, multi-pair PSLQ (one-, two- or three-level versions), high-precision quadrature, i.e. numeric integration (Gaussian, error function or tanh-sinh), and summation of series.
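
Of the quadrature schemes listed above, tanh-sinh is the easiest to sketch. The example below is not ARPREC code and runs in ordinary double precision rather than high precision; the function name, the step size `h`, and the truncation index `n` are assumptions chosen for illustration.

```python
import math


def tanh_sinh(f, h=0.1, n=40):
    """Approximate the integral of f over [-1, 1] by tanh-sinh quadrature.

    Substituting x = tanh(pi/2 * sinh(t)) maps [-1, 1] onto the whole real
    line, where the plain trapezoidal rule with step h converges very quickly.
    """
    total = 0.0
    for k in range(-n, n + 1):
        t = k * h
        u = 0.5 * math.pi * math.sinh(t)
        x = math.tanh(u)                                       # abscissa
        w = 0.5 * math.pi * math.cosh(t) / math.cosh(u) ** 2   # weight dx/dt
        total += w * f(x)
    return h * total


# The integral of 1 / (1 + x^2) over [-1, 1] is pi / 2.
approx = tanh_sinh(lambda x: 1.0 / (1.0 + x * x))
error = abs(approx - math.pi / 2)
```

The weights decay doubly exponentially, so a modest number of nodes is enough here. A high-precision version would follow the same structure with a multiprecision number type in place of floats.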


ArrayFire is a high performance software library for parallel computing with an easy-to-use API. Its array based function set makes parallel programming simple. ArrayFire’s multiple backends (CUDA, OpenCL and native CPU) make it platform independent and highly portable. A few lines of code in ArrayFire can replace dozens of lines of parallel computing code, saving you valuable time and lowering development costs.


AsciiDoc is a text document format for writing notes, documentation, articles, books, ebooks, slideshows, web pages, man pages and blogs. AsciiDoc files can be translated to many formats including HTML, PDF, EPUB, man page. AsciiDoc is highly configurable: both the AsciiDoc source file syntax and the backend output markups (which can be almost any type of SGML/XML markup) can be customized and extended by the user.

See also Magic-Book-Project.


AsciidocToGo is a full-featured portable version of AsciiDoc that contains the complete toolchain to build HTML or DocBook/LaTeX-based PDF documentation out of plain ASCII text files. Just download AsciidocToGo and start writing instead of searching for days or maybe weeks to put together all of the required software parts.


Asciidoctor is a fast text processor and publishing toolchain for converting AsciiDoc content to HTML5, DocBook 5 (or 4.5) and other formats. The Asciidoctor project is an effort to bring a comprehensive and accessible publishing toolchain, centered around the AsciiDoc syntax, to a growing range of ecosystems, including Ruby, JavaScript and the JVM.

In addition to the standard AsciiDoc syntax, Asciidoctor recognizes additional markup and formatting options, such as font-based icons (e.g., icon:fire[]) and UI elements (e.g., button:[Save]). Asciidoctor also offers a modern, responsive theme based on Foundation to style the HTML5 output.

In addition to an AsciiDoc processor and a collection of stylesheets, the project provides plugins for Maven, Gradle and Guard and packages for operating systems such as Fedora, Debian and Ubuntu. It also pushes AsciiDoc to evolve by introducing new ideas and innovation and helps promote AsciiDoc through education and advocacy.


An assortment of backends (i.e., templates) for Asciidoctor, a pure Ruby port of the AsciiDoc markup language. In this repository, you’ll find replicas of both the html5 and docbook45 backends from AsciiDoc (and Asciidoctor) written in both Haml and Slim, as well as backends for generating HTML5 presentations from AsciiDoc.


Asciidoctor EPUB3 is a set of Asciidoctor extensions for converting AsciiDoc to EPUB3 & KF8/MOBI.


A Gradle plugin that uses Asciidoctor via JRuby to process AsciiDoc source files within the project.


Asciidoctor LaTeX is a set of Asciidoctor extensions for converting AsciiDoc to LaTeX.


A native PDF renderer for AsciiDoc based on Asciidoctor and Prawn.


JavaScript port of Asciidoctor produced by Opal, a Ruby to JavaScript cross compiler.


DocGist is a URL proxy tool that converts AsciiDoc documents fetched from Gists, GitHub repositories, Dropbox folders and other sources to HTML. The conversion to HTML is performed in the browser (client-side) using the Asciidoctor.js JavaScript library. DocGist can render documents located anywhere, as long as the host permits cross-domain access.


The Magic Book Project is an open-source framework that facilitates the design and production of electronic and print books for authors. Rather than type into a word processor, the Magic Book Project allows an author to write a book once (using ASCIIDOC, a simple text document format) and procedurally generate the layout for a variety of formats using modern code-based design tools, such as CSS, the stylesheet standard. Write your book once, press a magic button, and out come multiple versions: printed hardcopy, digital PDF, HTML, MOBI, and EPUB.


MPLW is a Matplotlib (MPL) wrapper that can work as an AsciiDoc filter. Using this filter, you can generate plots from inline matplotlib scripts.


A complete editor for structured text documents with proofreading features. RTextDoc is designed for typesetting professional research papers using LaTeX that are heavy on mathematics and images. In addition, it is designed for writing notes, books, ebooks, slideshows, web pages, man pages and blogs using AsciiDoc mark-up language. RTextDoc also supports DocBook.


Real-time collaborative editor for AsciiDoc files.


asciinema is a free and open source solution for recording terminal sessions and sharing them on the web. When you run asciinema rec in your terminal, the recording starts, capturing all output that is printed to your terminal while you issue shell commands. When the recording finishes (by hitting Ctrl-D or typing exit), the captured output is uploaded to the website and prepared for playback on the web.


All three methods have been implemented in the new MAPLE package ASP (Automated Symmetry Package) which is an add-on to the MAPLE symmetry package DESOLVII (Vu, Jefferson and Carminati (2012) [25]). To our knowledge, this is the first computer package to automate all three methods of determining approximate symmetries for differential systems. Extensions to the theory have also been suggested for the third method and which generalise the first method to systems of differential equations. Finally, a number of approximate symmetries and corresponding solutions are compared with results in the literature.


Assimulo is a simulation package for solving ordinary differential equations. It is written in the high-level programming language Python and combines a variety of different solvers written in FORTRAN, C and even Python via a common high-level interface. The primary aim of Assimulo is not to develop new integration algorithms. The aim is to provide a high-level interface for a wide variety of solvers, both new and old, solvers of industrial standards as well as experimental solvers. The aim is to allow comparison of solvers for a given problem without the need to define the problem in a number of different programming languages to accommodate the different solvers.
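
The value of a common interface is that one problem definition can be handed to several solvers for comparison. The sketch below illustrates that idea only. It does not use Assimulo's API: the function names, the scalar-only state, and the fixed step size are all simplifying assumptions.

```python
import math


def explicit_euler(rhs, t, y, h):
    """One explicit Euler step for y' = rhs(t, y)."""
    return y + h * rhs(t, y)


def rk4(rhs, t, y, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, y + h / 2 * k1)
    k3 = rhs(t + h / 2, y + h / 2 * k2)
    k4 = rhs(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def simulate(rhs, y0, t_end, steps, stepper):
    """Integrate from t = 0 to t_end with any stepper sharing the same signature."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y = stepper(rhs, t, y, h)
        t += h
    return y


def decay(t, y):
    """The problem is defined once: y' = -y, y(0) = 1, exact solution exp(-t)."""
    return -y


exact = math.exp(-1.0)
errors = {
    stepper.__name__: abs(simulate(decay, 1.0, 1.0, 100, stepper) - exact)
    for stepper in (explicit_euler, rk4)
}
```

Because both steppers accept the same arguments, adding a third solver to the comparison means adding one function and one entry in the tuple.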


A text editor for the 21st century.


Authorea is the collaborative platform for research. Write and manage your technical documents in one place.

Azimuth Project

The Azimuth Project is an international collaboration to create a focal point for scientists and engineers interested in saving the planet. Our goal is to make clearly presented, accurate information on the relevant issues easy to find, and to help people work together on our common problems.


A new research operating system being built from scratch. We are exploring how to structure an OS for future multi- and many-core systems. We are motivated by two closely related trends in hardware design: first, the rapidly growing number of cores, which leads to a scalability challenge, and second, the increasing diversity in computer hardware, requiring the OS to manage and exploit heterogeneous hardware resources.

Barrelfish is a “multikernel” operating system [3]: it consists of a small kernel running on each core (one kernel per core), while the rest of the OS is structured as a distributed system of single-core processes atop these kernels. Kernels share no memory, even on a machine with cache-coherent shared RAM, and the rest of the OS does not use shared memory except for transferring messages and data between cores, and booting other cores. Applications can use multiple cores and share address spaces (and therefore cache-coherent shared memory) between cores, but this facility is provided by user-space runtime libraries.


Baudline is a time-frequency browser designed for scientific visualization of the spectral domain. Signal analysis is performed by Fourier, correlation, and raster transforms that create colorful spectrograms with vibrant detail. Conduct test and measurement experiments with the built in function generator, or play back audio files with a multitude of effects and filters. The baudline signal analyzer combines fast digital signal processing, versatile high speed displays, and continuous capture tools for hunting down and studying elusive signal characteristics.


BDMPI is a message passing library and associated runtime system for developing out-of-core distributed computing applications for problems whose aggregate memory requirements exceed the amount of memory that is available on the underlying computing cluster. BDMPI is based on the Message Passing Interface (MPI) and provides a subset of MPI’s API along with some extensions that are designed for BDMPI’s memory and execution model.

A BDMPI-based application is a standard memory-scalable parallel MPI program that was developed assuming that the underlying system has enough computational nodes to allow for the in-memory execution of the computations. This program is then executed using a sufficiently large number of processes so that the per-process memory fits within the physical memory available on the underlying computational node(s). BDMPI maps one or more of these processes to the computational nodes by relying on the OS’s virtual memory management to accommodate the aggregate amount of memory required by them. BDMPI prevents memory thrashing by coordinating the execution of these processes using node-level co-operative multi-tasking that limits the number of processes that can be running at any given time. This ensures that the currently running process(es) can establish and retain memory residency and thus achieve efficient execution. BDMPI exploits the natural blocking points that exist in MPI programs to transparently schedule the co-operative execution of the different processes. In addition, BDMPI’s implementation of MPI’s communication operations is designed to maximize the time over which a process can execute between successive blocking points. This allows it to amortize the cost of loading data from disk over the maximal amount of computations that can be performed.
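
The scheduling idea described above, switching processes only at their natural blocking points, can be illustrated with a toy model. This sketch is not BDMPI code and involves no MPI: Python generators stand in for processes, each `yield` stands in for a blocking communication call, and the limit on concurrently running processes is assumed to be one. All names are hypothetical.

```python
from collections import deque


def worker(name, chunks, log):
    """A toy 'process': works on one chunk, then reaches a blocking point."""
    for chunk in range(chunks):
        log.append((name, chunk))  # stands in for computing on resident data
        yield                      # blocking point, e.g. waiting for a message


def run_cooperatively(processes):
    """Run the processes one at a time, switching only when one yields."""
    ready = deque(processes)
    while ready:
        proc = ready.popleft()
        try:
            next(proc)             # run until the next blocking point
        except StopIteration:
            continue               # finished: do not reschedule
        ready.append(proc)         # blocked: go to the back of the queue


log = []
run_cooperatively([worker("p0", 2, log), worker("p1", 2, log)])
# log is now [("p0", 0), ("p1", 0), ("p0", 1), ("p1", 1)]
```

Only one toy process is ever active, so only its data would need to be resident in memory, and the switch happens at a point where the process could not make progress anyway.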

Since BDMPI is based on the standard MPI library, it also provides a framework that allows the automated out-of-core execution of existing MPI applications. BDMPI is implemented so as to be a drop-in replacement for existing MPI implementations, allowing existing codes that use the subset of MPI functions implemented by BDMPI to compile unchanged.


Beaker is a notebook-style development environment for working interactively with large and complex datasets. Its plugin-based architecture allows you to switch between languages or add new ones with ease, ensuring that you always have the right tool for any of your analysis and visualization needs.


A high-performance parallel file system from the Fraunhofer Center for High Performance Computing. BeeGFS is a pure software solution for scale-out parallel network-accessible storage, developed with a strong focus on performance and designed for very easy installation and management. If I/O intensive workloads are your problem, BeeGFS is the solution.

BeeGFS provides a common file system for shared access to multiple clients and transparently spreads user data across multiple servers. By increasing the number of servers and/or disks in the system, you can simply scale performance and capacity of the file system to the level that you need.

BID Data Project

The BID Data Suite is a collection of hardware, software and design patterns that enable fast, large-scale data mining at very low cost. The software consists of two parts:

  • BIDMat, an interactive matrix library that integrates CPU and GPU acceleration and novel computational kernels.

  • BIDMach, a machine learning system that includes very efficient model optimizers and mixing strategies.

BIDMach is an interactive environment designed to make it extremely easy to build and use machine learning models. BIDMach includes core classes that take care of managing data sources, optimization and distributing data over CPUs or GPUs. It’s very easy to write your own models by generalizing from the models already included in the Toolkit.


Virtual large arrays and lazy evaluation.


BigView allows for interactive panning and zooming of images of arbitrary size on desktop PCs running Linux. Additionally, it can work in a multi-screen environment where multiple PCs cooperate to view a single, large image. Using this software, one can explore — on relatively modest machines — images such as the Mars Orbiter Camera mosaic [92,160×33,280 pixels].

The images must be first converted into “paged” format, where the image is stored in 256×256 “pages” to allow rapid movement of pixels into texture memory. The format contains an “image pyramid”: a set of scaled versions of the original image. Each scaled image is 1/2 the size of the previous, starting with the original down to the smallest, which fits into a single 256×256 page.
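As a rough illustration of the image pyramid, the sketch below lists the levels for the Mars Orbiter Camera mosaic mentioned above. The function name `pyramid_levels` is hypothetical, and rounding odd dimensions up when halving is an assumption; the actual BigView format may round differently.

```python
import math

def pyramid_levels(width, height, page=256):
    """Return (width, height, page_count) per level, halving until one page suffices."""
    levels = []
    w, h = width, height
    while True:
        pages = math.ceil(w / page) * math.ceil(h / page)
        levels.append((w, h, pages))
        if w <= page and h <= page:
            break
        # Assumption: odd dimensions are rounded up when halved.
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    return levels

levels = pyramid_levels(92160, 33280)
for w, h, pages in levels:
    print(f"{w:>6} x {h:<6} {pages:>6} pages")
```

Under these assumptions the mosaic needs 46,800 pages at full resolution and ten levels in total, the last being 180×65 pixels in a single page.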


A repository for Conda binaries, amongst other things.

Rich Signell’s Binstar -


The Basic Linear Algebra Subprograms (BLAS) are a specified set of low-level subroutines that perform common linear algebra operations such as copying, vector scaling, vector dot products, linear combinations, and matrix multiplication. They were first published as a Fortran library in 1979 and are still used as a building block in higher-level math programming languages and libraries.
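To make the operations concrete, here is a plain-Python sketch of a few of them. The function names follow the usual BLAS routine names without the type prefix (for example DAXPY, DGEMM), but this is only an unoptimized illustration: real BLAS routines update their output arguments in place and take strides and leading dimensions, which are omitted here.

```python
def scal(alpha, x):
    """Level 1: return alpha * x (vector scaling)."""
    return [alpha * xi for xi in x]

def dot(x, y):
    """Level 1: return the dot product of x and y."""
    return sum(xi * yi for xi, yi in zip(x, y))

def axpy(alpha, x, y):
    """Level 1: return alpha * x + y (linear combination)."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(alpha, A, x, beta, y):
    """Level 2: return alpha * A @ x + beta * y for a row-major matrix A."""
    return [alpha * dot(row, x) + beta * yi for row, yi in zip(A, y)]

def gemm(alpha, A, B, beta, C):
    """Level 3: return alpha * A @ B + beta * C for row-major matrices."""
    cols_of_B = list(zip(*B))
    return [
        [alpha * dot(row, col) + beta * cij for col, cij in zip(cols_of_B, c_row)]
        for row, c_row in zip(A, C)
    ]

print(axpy(2, [1, 2], [3, 4]))
print(gemm(1, [[1, 2], [3, 4]], [[5, 6], [7, 8]], 0, [[0, 0], [0, 0]]))
```

The three levels differ in how much arithmetic they do per element of data touched, which is why optimized libraries concentrate most of their tuning effort on Level 3 routines such as matrix multiplication.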

See also ACML.


Automatically Tuned Linear Algebra Software (ATLAS) is a software library for linear algebra. It provides a mature open source implementation of BLAS APIs for C and Fortran77. ATLAS is often recommended as a way to automatically generate an optimized BLAS library. While its performance often trails that of specialized libraries written for one specific hardware platform, it is often the first or even only optimized BLAS implementation available on new systems and is a large improvement over the generic BLAS available at Netlib. For this reason, ATLAS is sometimes used as a performance baseline for comparison with other products.

This site contains the official reference implementation of BLAS, from which all others have flowed. There are Fortran and C versions which should compile just about anywhere, although they are not optimized for a specific processor beyond the capabilities of the compiler used.


A software framework for instantiating high-performance BLAS-like dense linear algebra libraries. The BLAS-like Library Instantiation Software (BLIS) is a framework for rapid instantiation of high-performance libraries with Basic Linear Algebra Subprograms (BLAS) functionality.

Build to Order BLAS

The Build to Order BLAS system is a compiler that generates high-performance implementations of basic linear algebra kernels.

The term BLAS in the name is for Basic Linear Algebra Subprograms. The BLAS is a standard API for important linear algebra operations. The BLAS are implemented by most hardware vendors. Traditionally, each routine in the BLAS is implemented by hand by a highly skilled programmer. The Build to Order BLAS compiler automates the implementation of not only the BLAS standard but also any sequence of basic linear algebra operations.

The user of the Build to Order BLAS compiler writes down a specification for a sequence of matrix and vector operations together with a description of the input and output parameters. The compiler then tries out many different choices of how to implement, optimize, and tune those operations for the user’s computer hardware. The compiler chooses the best option, which is output as a C file containing a function that implements the specified operations.


This repository houses the code for the OpenCL™ BLAS portion of clMath. The complete set of BLAS level 1, 2 & 3 routines is implemented. Please see Netlib BLAS for the list of supported routines. In addition to GPU devices, the library also supports running on CPU devices to facilitate debugging and multicore programming. APPML 1.10 is the most current generally available pre-packaged binary version of the library available for download for both Linux and Windows platforms.

The primary goal of clBLAS is to make it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing. clBLAS interfaces neither hide nor wrap OpenCL interfaces, but rather leave OpenCL state management to the control of the user to allow for maximum performance and flexibility. The clBLAS library does generate and enqueue optimized OpenCL kernels, relieving the user from the task of writing, optimizing and maintaining kernel code themselves.


clMath is the open-source project for OpenCL based BLAS and FFT libraries. The complete set of BLAS level 1, 2 & 3 routines is implemented.


The NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library is a GPU-accelerated version of the complete standard BLAS library that delivers 6x to 17x faster performance than the latest MKL BLAS. New in CUDA 6.0 is multi-GPU support in cuBLAS-XT.


A set of routines which accelerate Level 3 BLAS (Basic Linear Algebra Subroutine) calls by spreading work across more than one GPU. By using a streaming design, cuBLAS-XT efficiently manages transfers across the PCI-Express bus automatically, which allows input and output data to be stored on the host’s system memory. This provides out-of-core operation – the size of operand data is only limited by system memory size, not by GPU on-board memory size.

Starting with CUDA 6.0, a free version of cuBLAS-XT is included in the CUDA toolkit as part of the cuBLAS library.


KBLAS (KAUST-BLAS) is a small open-source library that optimizes critical numerical kernels on CUDA-enabled GPUs. KBLAS provides a subset of standard BLAS functions. It also provides some functions with BLAS-like interfaces that target both single- and multi-GPU systems.

The ultimate goal for KBLAS is performance. KBLAS has a set of tuning parameters that affect its performance according to the GPU architecture and the CUDA runtime version. While we cannot guarantee optimal performance with the default tuning parameters, users can easily edit these parameters on their local systems. KBLAS might be shipped with autotuners in the future.


This C++ class library introduces Matrix, Vector, subMatrices, and LAStreams over the real domain. The library contains efficient and fool-proof implementations of level 1 and 2 BLAS (element-wise operations and various multiplications), transposition, determinant evaluation and matrix inverse. There are operations on a single row/col/diagonal of a matrix. Distinct features of the package are Matrix views, Matrix streams, and LazyMatrices. Lazy construction allows us to write matrix expressions in a natural way without introducing any hidden temporaries, deep copying, and any reference counting.


OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.


A high performance C++ implementation of BLAS (Basic Linear Algebra Subprograms). Standard conforming interfaces for C and Fortran are provided.


Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) is a package, written in C and MATLAB/OCTAVE, that includes an eigensolver implemented with the Locally Optimal Block Preconditioned Conjugate Gradient Method (LOBPCG). Its main features are: a matrix-free iterative method for computing several extreme eigenpairs of symmetric positive generalized eigenproblems; a user-defined symmetric positive preconditioner; robustness with respect to random initial approximations, variable preconditioners, and ill-conditioning of the stiffness matrix; and apparently optimal convergence speed.

BLOPEX supports parallel MPI-based computations. BLOPEX is incorporated in the HYPRE package and is available as an external block to the PETSc package. SLEPc and PHAML have interfaces to call BLOPEX eigensolvers.


A blocking, shuffling and loss-less compression library. Blosc is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc is the first compressor (that I’m aware of) that is meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate memory-bound computations (which is typical in vector-vector operations).
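The shuffling step can be illustrated with a short sketch. It is not Blosc's implementation: the helpers `shuffle` and `unshuffle` are hypothetical names, `zlib` from the standard library stands in for the compression codec, and the blocking and multi-threading that Blosc adds are left out.

```python
import struct
import zlib

def shuffle(data: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every item together, then byte 1, and so on."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    return b"".join(data[i::itemsize] for i in range(itemsize))

def unshuffle(data: bytes, itemsize: int) -> bytes:
    """Invert shuffle(): reassemble each item from its byte planes."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    n_items = len(data) // itemsize
    return b"".join(data[i::n_items] for i in range(n_items))

# 1000 consecutive 64-bit integers: the high-order bytes are nearly all zero.
raw = struct.pack("<1000q", *range(1000))
plain_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(shuffle(raw, 8)))
print("compressed without shuffle:", plain_size, "bytes")
print("compressed with shuffle:   ", shuffled_size, "bytes")
```

Shuffling places the slowly varying high-order bytes of neighbouring values next to each other, which tends to give a general-purpose compressor longer runs to work with on typical numeric data.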


Command line interface to and serialization format for Blosc, a high performance, multi-threaded, blocking and shuffling compressor. Uses python-blosc bindings to interface with Blosc. Also comes with native support for efficiently serializing and deserializing Numpy arrays.


Web sites are made of lots of things — frameworks, libraries, assets, utilities, and rainbows. Bower manages all these things for you.

Bower works by fetching and installing packages from all over, taking care of hunting, finding, downloading, and saving the stuff you’re looking for. Bower keeps track of these packages in a manifest file, bower.json. How you use packages is up to you. Bower provides hooks to facilitate using packages in your tools and workflows.

Bower is optimized for the front-end. Bower uses a flat dependency tree, requiring only one version for each package, reducing page load to a minimum.

A very useful thing is the search engine for packages that can be installed by Bower.


BRL-CAD is a powerful cross-platform open source solid modeling system that includes interactive geometry editing, high-performance ray-tracing for rendering and geometric analysis, image and signal-processing tools, a system performance analysis benchmark suite, libraries for robust geometric representation, with more than 20 years of active development.


Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).
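The idea behind global deduplication can be sketched with a content-addressed chunk store. This is a simplification and not bup's code: bup splits files at boundaries chosen by a rolling checksum and stores the pieces in git packfiles, whereas the hypothetical `ChunkStore` below uses fixed-size chunks and an in-memory dictionary.

```python
import hashlib

class ChunkStore:
    """Store each distinct chunk once, keyed by its SHA-1 digest."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes

    def save(self, data: bytes):
        """Split data into chunks and return the list of digests (the 'recipe')."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha1(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # only new content is stored
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

store = ChunkStore(chunk_size=4)
first = store.save(b"aaaabbbbaaaa")   # 'aaaa' repeats within the file
second = store.save(b"bbbbcccc")      # 'bbbb' repeats across files
print(len(store.chunks), "unique chunks stored for 5 chunks saved")
```

A second backup of mostly unchanged data adds only the chunks that differ, which is what makes incremental saves cheap.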



ImplicitCAD is a project dedicated to using the power of math and computer science to get stupid design problems out of the way of the 3D printing revolution.


OpenSCAD is software for creating solid 3D CAD models. It is free software and available for Linux/UNIX, Windows and Mac OS X. Unlike most free software for creating 3D models (such as Blender), it does not focus on the artistic aspects of 3D modelling but instead on the CAD aspects. Thus it might be the application you are looking for when you are planning to create 3D models of machine parts, but it is probably not what you are looking for when you are more interested in creating computer-animated movies.

OpenSCAD is not an interactive modeller. Instead it is something like a 3D-compiler that reads in a script file that describes the object and renders the 3D model from this script file. This gives you (the designer) full control over the modelling process and enables you to easily change any step in the modelling process or make designs that are defined by configurable parameters.

OpenSCAD provides two main modelling techniques: constructive solid geometry (CSG) and extrusion of 2D outlines. AutoCAD DXF files are used as the data exchange format for these 2D outlines. In addition to 2D paths for extrusion, it is also possible to read design parameters from DXF files. Besides DXF files, OpenSCAD can read and create 3D models in the STL and OFF file formats.
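Constructive solid geometry builds shapes by combining primitives with boolean operations. The sketch below shows the idea using point-membership tests in Python. It is not OpenSCAD's script language and not how OpenSCAD evaluates models internally; all function names are hypothetical.

```python
def sphere(cx, cy, cz, r):
    """Solid sphere: a predicate that is True for points inside or on the surface."""
    return lambda x, y, z: (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r

def box(x0, y0, z0, x1, y1, z1):
    """Solid axis-aligned box spanning the two corner points."""
    return lambda x, y, z: x0 <= x <= x1 and y0 <= y <= y1 and z0 <= z <= z1

def union(a, b):
    return lambda x, y, z: a(x, y, z) or b(x, y, z)

def intersection(a, b):
    return lambda x, y, z: a(x, y, z) and b(x, y, z)

def difference(a, b):
    return lambda x, y, z: a(x, y, z) and not b(x, y, z)

# A cube of side 2 with a spherical cavity of radius 0.5 at its centre.
part = difference(box(-1, -1, -1, 1, 1, 1), sphere(0, 0, 0, 0.5))

print(part(0, 0, 0))        # centre lies in the cavity
print(part(0.9, 0.9, 0.9))  # corner region is solid
print(part(2, 0, 0))        # outside the cube
```

Because the model is an expression tree, changing a parameter such as the cavity radius and re-evaluating is all it takes to produce a variant, which is the kind of control a script-driven modeller offers.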


Calaos is a free software project (GPLv3) that lets you control and monitor your home. You can easily install and use it to transform your home into a smart home.


Cartopy is a Python package designed to make drawing maps for data analysis and visualisation as easy as possible. Cartopy makes use of the powerful PROJ.4, numpy and shapely libraries and has a simple and intuitive drawing interface to matplotlib for creating publication quality maps.

The features of Cartopy include:

  • object oriented projection definitions

  • point, line, vector, polygon and image transformations between projections

  • integration to expose advanced mapping in matplotlib with a simple and intuitive interface

  • powerful vector data handling by integrating shapefile reading with Shapely capabilities


It is already common for simulations to discard most of what they compute in order to minimize time spent on I/O. As we enter the exascale age the problem of scarce I/O capability continues to grow. Since storing data is no longer viable for many simulation applications, data analysis and visualization must now be performed in situ with the simulation to ensure that it is running smoothly and to fully understand the results that the simulation produces. Catalyst is a light-weight version of the ParaView server library that is designed to be directly embedded into parallel simulation codes to perform in situ analysis at run time.


A C and Fortran Interface to access Climate and NWP model Data. Supported data formats are GRIB, netCDF, SERVICE, EXTRA and IEG.


Software that enables our collaborators to easily harness large scale distributed systems such as clusters, clouds, and grids. We perform fundamental computer science research that enables new discoveries through computing in fields such as physics, chemistry, bioinformatics, biometrics, and data mining. The tools are:

  • Parrot - Parrot is a tool for attaching existing programs to remote I/O systems through the filesystem interface. Parrot "speaks" a variety of remote I/O services, including HTTP, FTP, GridFTP, iRODS, HDFS, XRootD, GROW, and Chirp, on behalf of ordinary programs.

  • Chirp - A user-level file system for collaboration across distributed systems such as clusters, clouds, and grids. Chirp allows ordinary users to discover, share, and access storage, whether within a single machine room or over a wide area network.

  • Makeflow - A workflow engine for executing large complex workflows on clusters, clouds, and grids. Makeflow is very similar to traditional Make, so if you can write a Makefile, then you can write a Makeflow.

  • Work Queue - A framework for building large master-worker applications that span many computers including clusters, clouds, and grids. Work Queue applications are written in C, Perl, or Python using a simple API that allows users to define tasks, submit them to the queue, and wait for completion. Tasks are executed by a standard worker process that can run on any available machine. Each worker calls home to the master process, arranges for data transfer, and executes the tasks. The system handles a wide variety of failures, allowing for dynamically scalable and robust applications.

  • SAND - A set of modules for genome assembly that are built atop the Work Queue platform for large-scale distributed computation on clusters, clouds, or grids.
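The master-worker pattern that Work Queue implements (define tasks, submit them to a queue, wait for completion) can be sketched on a single machine as follows. This does not use the Work Queue API: threads stand in for remote workers, and `run_master` and its parameters are hypothetical names.

```python
import queue
import threading

def run_master(tasks, worker_count=3):
    """Submit (function, argument) tasks and collect {task_id: (value, error)}."""
    todo = queue.Queue()
    done = queue.Queue()
    for task_id, (func, arg) in enumerate(tasks):
        todo.put((task_id, func, arg))  # submit every task up front

    def worker():
        # Each worker repeatedly asks the master's queue for work.
        while True:
            try:
                task_id, func, arg = todo.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            try:
                done.put((task_id, func(arg), None))
            except Exception as exc:
                # A failed task is reported rather than crashing the worker.
                done.put((task_id, None, exc))

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # wait for completion

    results = {}
    while not done.empty():
        task_id, value, error = done.get()
        results[task_id] = (value, error)
    return results

results = run_master([(lambda n: n * n, i) for i in range(5)])
print(sorted((task_id, value) for task_id, (value, _) in results.items()))
```

In the real system the workers are separate processes on other machines and the framework also handles data transfer and worker failure; the sketch only captures the task bookkeeping.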


A large tool set for working on climate and NWP model data. NetCDF 3/4, GRIB 1/2 including SZIP and JPEG compression, EXTRA, SERVICE and IEG are supported as IO-formats. Apart from that CDO can be used to analyse any kind of gridded data not related to climate science. CDO has very small memory requirements and can process files larger than the physical memory.

./configure --enable-cdi-lib --with-fftw3 --with-jasper=/usr/lib64 \
  --with-libxml2=yes --with-udunits2=/usr/lib64 --with-curl=/usr/lib64 \
  --with-proj=/usr/lib64 --with-netcdf=yes --with-hdf5=yes --with-szlib=yes \
  --with-threads=yes --with-grib-api=yes


configure: CDO is configured with the following options:
   "CC"                 : "gcc -std=gnu99",
   "CPP"                : "gcc -E",
   "CPPFLAGS"           : "-I/usr/lib64/include -I/usr/lib64/include -I/usr/lib64/include -I/usr/lib64/include -I/usr/include/libxml2",
   "CFLAGS"             : "-g -O2 -fopenmp ",
   "LDFLAGS"            : "-L/usr/lib64/lib -L/usr/lib64/lib  -L/usr/lib64/lib -L/usr/lib64/lib",
   "LIBS"               : "-lxml2 -ludunits2 -lcurl -lproj -lfftw3 -lgrib_api -ljasper -lnetcdf -lhdf5_hl -lhdf5 -lsz -lz  -lm ",
   "FCFLAGS"            : "",
   "INCLUDES"           : "@INCLUDES@",
   "LD"                 : "/usr/bin/ld -m elf_x86_64",
   "NM"                 : "/usr/bin/nm -B",
   "AR"                 : "ar",
   "AS"                 : "as",
   "DLLTOOL"            : "false",
   "OBJDUMP"            : "objdump",
   "STRIP"              : "strip",
   "RANLIB"             : "ranlib",
   "INSTALL"            : "/usr/bin/install -c",
   "cdi"                : {
     "enable_cdi_lib" : true
  "threads"    : {
    "lib"      : "",
    "include"  : ""
  "zlib"       : {
    "lib"      : " -lz",
  "szlib"      : {
    "lib"      : " -lsz",
    "include"  : ""
  "hdf5"       : {
    "lib"      : " -lhdf5",
    "include"  : ""
  "netcdf"     : {
    "lib"      : " -lnetcdf",
    "include"  : ""
  "udunits2"   : {
    "lib"      : " -L/usr/lib64/lib -ludunits2",
    "include"  : " -I/usr/lib64/include"
  "proj"       : {
    "lib"      : " -L/usr/lib64/lib -lproj",
    "include"  : " -I/usr/lib64/include"
  "USER_NAME"          : "baum",
  "HOST_NAME"          : "max",
  "SYSTEM_TYPE"        : "x86_64-unknown-linux-gnu"


Cdo{rb,py} allows you to use CDO from Python and Ruby as if it were a native library.

CDSC Mapper

The CDSC Mapper is a compiler package for heterogeneous mapping on various targets such as multi-core CPUs, GPUs and FPGAs. The objective is to provide the user with a complete compilation platform to ease the programming of complex heterogeneous devices, such as a Convey HC1-ex machine. The architecture of the compiler is based on a collection of production-quality compilers such as GNU GCC, Nvidia GCC and LLVM; two open-source compilation infrastructures on top of which development has been performed: the LLNL ROSE compiler and the LLVM project; and a collection of research compilers and runtime such as CnC-HC, PolyOpt and SDSLc.


The CEOP Satellite Data Server is actually a gateway with an OPeNDAP front end and the ability to access data via the OGC WCS protocol on the back end. Though originally developed for the Coordinated Enhanced Observing Period (CEOP) effort, it can be used with other WCS servers. It is implemented as a plug-in handler to the Hyrax server distributed by OPeNDAP.


Ceph is a free software storage platform designed to present object, block, and file storage from a single distributed computer cluster. Ceph’s main goals are to be completely distributed without a single point of failure, scalable to the exabyte level, and freely-available. The data is replicated, making it fault tolerant. Ceph software runs on commodity hardware. The system is designed to be both self-healing and self-managing and strives to reduce both administrator and budget overhead.


Cetus is a compiler infrastructure for the source-to-source transformation of software programs. It currently supports ANSI C. Since its creation in 2004, it has grown to over 80,000 lines of Java code, has been made available publicly on the web, and has become a basis for several research projects.

CFD Utilities

The CFD Utility Software Library (previously known as the Aerodynamics Division Software Library at NASA Ames Research Center) contains nearly 30 libraries of generalized subroutines and close to 100 applications built upon those libraries. These utilities have accumulated during four decades or so of software development in the aerospace field.

All are written in Fortran 90 or FORTRAN 77 with potential reuse in mind. The only exception is the C translations of a dozen or so numerics routines grouped as C_utilities.

David Saunders and Robert Kennelly are the primary authors, but miscellaneous contributions by others are gratefully acknowledged.

See 1-line summaries of the libraries and applications under the Files menu. Each library folder also contains 1-line summaries of the grouped subroutines, while each application folder contains READMEs adapted from the main program headers. NASA permission to upload actual software was granted on Jan. 24, 2014.


An I/O library for climate models, named CFIO (Climate Fast I/O).

CFIO provides the same interface and features as PnetCDF, and adopts an I/O forwarding technique to automatically overlap I/O with computation. CFIO performs better than PnetCDF in terms of reducing the overall running time of the program.


CF-compliant NetCDF for radial data.


The CGAL Bindings project allows you to use some packages of CGAL, the Computational Geometry Algorithms Library, in languages other than C++, such as Java and Python. The bindings are implemented with SWIG.


An emerging parallel programming language whose design and development are being led by Cray Inc. in collaboration with academia, computing centers, and industry. Chapel’s goal is to make parallel programming more productive, from high-end supercomputers to commodity clusters and multicore desktops and laptops. Chapel is being developed in an open-source manner at SourceForge and is released under the BSD license.

Chapel supports a multithreaded execution model via high-level abstractions for data parallelism, task parallelism, concurrency, and nested parallelism. Chapel’s locale type enables users to specify and reason about the placement of data and tasks on a target architecture in order to tune for locality. Chapel supports global-view data aggregates with user-defined implementations, permitting operations on distributed data structures to be expressed in a natural manner. In contrast to many previous higher-level parallel languages, Chapel is designed around a multiresolution philosophy, permitting users to initially write very abstract code and then incrementally add more detail until they are as close to the machine as their needs require. Chapel supports code reuse and rapid prototyping via object-oriented design, type inference, and features for generic programming.

Chapel was designed from first principles rather than by extending an existing language. It is an imperative block-structured language, designed to be easy to learn for users of C, C++, Fortran, Java, Python, Matlab, and other popular languages. While Chapel builds on concepts and syntax from many previous languages, its parallel features are most directly influenced by ZPL, High-Performance Fortran (HPF), and the Cray MTA™/Cray XMT™ extensions to C and Fortran.


A high-performance language interoperability tool that generates Babel-compatible bindings for the Chapel programming language. For details on using the command-line tool, please consult the BRAID man page and the Babel user’s guide.


This provides interoperability with Chapel in three forms:

  • Chapel code inlined in Python

  • Chapel code from source-files

  • Compile Chapel modules into Python modules


Interactive geometry software. Besides support for dynamic geometry, Cinderella.2 has many features that broaden the scope of the program to a wide variety of interaction scenarios. Compared to the old version of the program, two completely new parts were added: CindyLab, an environment for doing interactive physical experiments, and CindyScript, a high-level programming language that allows for fast, flexible and freely programmable interaction scenarios. Although each of the three parts of the program (geometry, physical simulation and scripting) can be used in a standalone manner, the program unleashes its full power when all three parts are used in combination. They are designed to interact very smoothly.


Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions. There are other reasons why a circular layout is advantageous, not the least being the fact that it is attractive.

Circos is ideal for creating publication-quality infographics and illustrations with a high data-to-ink ratio, richly layered data and pleasant symmetries. You have fine control over each element in the figure to tailor its focus points and detail to your audience.


CKAN is a powerful data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available.

CKAN is built with Python on the backend and Javascript on the frontend, and uses the Pylons web framework and SQLAlchemy as its ORM. Its database engine is PostgreSQL and its search is powered by SOLR. It has a modular architecture that allows extensions to be developed to provide additional features such as harvesting or data upload.

CKAN uses its internal model to store metadata about the different records, and presents it on a web interface that allows users to browse and search this metadata. It also offers a powerful API that allows third-party applications and services to be built around it.


This extension contains plugins that add geospatial capabilities to CKAN.


CL21 is an experimental project redesigning Common Lisp.


ClimatePipes uses a web-based application platform due to its widespread support on mainstream operating systems, ease-of-use, and inherent collaboration support. The front-end of ClimatePipes uses HTML5 (WebGL, CSS3) to deliver state-of-the-art visualization and to provide a best-in-class user experience. The back-end of ClimatePipes is built using the Visualization Toolkit (VTK), Climate Data Analysis Tools (CDAT), and other climate and geospatial data processing tools such as GDAL and PROJ4.


CLFORTRAN is an open source (LGPL) Fortran module, designed to provide direct access to GPU, CPU and accelerator based computing resources available by the OpenCL standard.


clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU optimized BLAS and LAPACK for AMD hardware, can be found in the AMD Accelerated Parallel Processing Math Libraries (APPML).


Clojure is a dynamic programming language that targets the Java Virtual Machine (and the CLR, and JavaScript). It is designed to be a general-purpose language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming. Clojure is a compiled language - it compiles directly to JVM bytecode, yet remains completely dynamic. Every feature supported by Clojure is supported at runtime. Clojure provides easy access to the Java frameworks, with optional type hints and type inference, to ensure that calls to Java can avoid reflection.

Clojure is a dialect of Lisp, and shares with Lisp the code-as-data philosophy and a powerful macro system. Clojure is predominantly a functional programming language, and features a rich set of immutable, persistent data structures. When mutable state is needed, Clojure offers a software transactional memory system and reactive Agent system that ensure clean, correct, multithreaded designs.


ClojureScript is a new compiler for Clojure that targets JavaScript. It is designed to emit JavaScript code which is compatible with the advanced compilation mode of the Google Closure optimizing compiler.


Leiningen is the easiest way to use Clojure. With a focus on project automation and declarative configuration, it gets out of your way and lets you focus on your code.


CLyther is a Python tool similar to Cython and PyPy. CLyther is a just-in-time specialization engine for OpenCL. The main entry points for CLyther are its clyther.task and clyther.kernel decorators. Once a function is decorated with one of these the function will be compiled to OpenCL when called.

CLyther is a Python language extension that makes writing OpenCL code as easy as Python itself. CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL.

CLyther exposes both the OpenCL C library as well as the OpenCL language to python.


CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. CLUTO is well-suited for clustering data sets arising in many diverse application areas including information retrieval, customer purchasing transactions, web, GIS, science, and biology.

CLUTO’s distribution consists of both stand-alone programs and a library via which an application program can access directly the various clustering and analysis algorithms implemented in CLUTO.


gCLUTO is a cross-platform graphical application for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. gCLUTO is built on top of the CLUTO clustering library.


wCLUTO is a web-enabled data clustering application that is designed for the clustering and data-analysis requirements of gene-expression analysis. wCLUTO is also built on top of the CLUTO clustering library. Users can upload their datasets, select from a number of clustering methods, perform the analysis on the server, and visualize the final results.


The "Climate Model Output Rewriter" (CMOR, pronounced "Seymour") comprises a set of C-based functions, with bindings to both Python and FORTRAN 90, that can be used to produce CF-compliant netCDF files that fulfill the requirements of many of the climate community’s standard model experiments. These experiments are collectively referred to as MIPs and include, for example, AMIP, CMIP, CFMIP, PMIP, APE, and IPCC scenario runs. The output resulting from CMOR is "self-describing" and facilitates analysis of results across models.

Much of the metadata written to the output files is defined in MIP-specific tables, typically made available from each MIP’s web site. CMOR relies on these tables to provide much of the metadata that is needed in the MIP context, thereby reducing the programming effort required of the individual MIP contributors.


The software package to be disclosed,, is a front end to an existing free software package, CMOR2 (Climate Model Output Rewriter), written by Lawrence Livermore National Laboratory (LLNL), and reads in a multitude of standard data formats, such as netcdf3, netcdf4, Grads control files, Matlab data files or a list of netcdf files, and converts the data into the CMIP5 data format to allow publication on the Earth System Grid Federation (ESGF) data node.


Intel Concurrent Collections for C++ is a C++ template library that lets C++ programmers implement CnC applications which run in parallel on shared and distributed memory.

CnC makes it easy to write C++ programs which take full advantage of the available parallelism. Whether run on multicore systems, Xeon Phi™ or clusters, CnC will seamlessly exploit the performance potential of your hardware. Through its portability and composability (with itself and other tools) it provides future-proof scalability.


The CnC-Python system under development in the Habanero project at Rice University builds on past work on the Intel Concurrent Collections (CnC) and Habanero CnC projects.


COCO (COmparing Continuous Optimisers) is a platform for systematic and sound comparisons of real-parameter global optimisers. COCO provides benchmark function testbeds and tools for processing and visualizing data generated by one or several optimizers. The COCO platform has been used for the Black-Box-Optimization-Benchmarking (BBOB) workshops that took place during the GECCO conference in 2009, 2010, 2012, and 2013.


Code::Blocks is a free C, C++ and Fortran IDE built to meet the most demanding needs of its users. It is designed to be very extensible and fully configurable. An IDE with all the features you need, having a consistent look, feel and operation across platforms. Built around a plugin framework, Code::Blocks can be extended with plugins. Any kind of functionality can be added by installing/coding a plugin. For instance, compiling and debugging functionality is already provided by plugins.


Code_Saturne solves the Navier-Stokes equations for 2D, 2D-axisymmetric and 3D flows, steady or unsteady, laminar or turbulent, incompressible or weakly dilatable, isothermal or not, with scalars transport if required.

Several turbulence models are available, from Reynolds-Averaged models to Large-Eddy Simulation models. In addition, a number of specific physical models are also available as "modules": gas, coal and heavy-fuel oil combustion, semi-transparent radiative transfer, particle-tracking with Lagrangian modeling, Joule effect, electric arcs, weakly compressible flows, atmospheric flows, rotor/stator interaction for hydraulic machines.


The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be huge, executing efficient kernels is fundamental. Their optimization is, however, a challenging issue. Even though affine loop nests are generally present, the short trip counts and the complexity of mathematical expressions make it hard to determine a single or unique sequence of successful transformations. Therefore, we present the design and systematic evaluation of COFFEE, a domain-specific compiler for local assembly kernels. COFFEE manipulates abstract syntax trees generated from a high-level domain-specific language for PDEs by introducing domain-aware composable optimizations aimed at improving instruction-level parallelism, especially SIMD vectorization, and register locality. It then generates C code including vector intrinsics.


A Pythonic package for combinatorics. Combi lets you explore spaces of permutations and combinations as if they were Python sequences, but without generating all the permutations/combinations in advance. It lets you specify a lot of special conditions on these spaces. It also provides a few more classes that might be useful in combinatorics programming.
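The idea of treating a permutation space as a sequence without materializing it can be sketched in a few lines. The class below is a hypothetical illustration, not Combi's API: it decodes an index in the factorial number system, so any single permutation is computed on demand.

```python
from math import factorial


class LazyPermutations:
    """Sequence-like view of all permutations of `items`, in lexicographic order.

    Hypothetical illustration only; the class and its methods are not part of Combi.
    """

    def __init__(self, items):
        self.items = tuple(items)
        self.count = factorial(len(self.items))

    def __len__(self):
        # Note: len() overflows for very large spaces; use .count instead.
        return self.count

    def __getitem__(self, index):
        if index < 0:
            index += self.count
        if not 0 <= index < self.count:
            raise IndexError("permutation index out of range")
        pool = list(self.items)
        result = []
        # Decode `index` in the factorial number system, one digit per position.
        for remaining in range(len(pool), 0, -1):
            digit, index = divmod(index, factorial(remaining - 1))
            result.append(pool.pop(digit))
        return tuple(result)


perms = LazyPermutations("abcd")
print(len(perms))   # 24
print(perms[0])     # ('a', 'b', 'c', 'd')
print(perms[23])    # ('d', 'c', 'b', 'a')

# A space far too large to enumerate is still cheap to index.
big = LazyPermutations(range(30))
print(big[10**20][:5])
```

Each lookup costs a handful of integer divisions and list pops, regardless of how many permutations precede it.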


COMCOT (Cornell Multi-grid Coupled Tsunami Model) is a tsunami modeling package, capable of simulating the entire lifespan of a tsunami, from its generation through propagation to runup/rundown in coastal regions.

Waves can be generated via incident wave maker, fault model, landslide, or even customized profile. Flexible nested grid setup allows for the balance between accuracy and efficiency.

compressive sampling

Compressive sampling is a signal processing technique for efficiently acquiring and reconstructing a signal, by finding solutions to underdetermined linear systems. This is based on the principle that, through optimization, the sparsity of a signal can be exploited to recover it from far fewer samples than required by the Shannon-Nyquist sampling theorem. There are two conditions under which recovery is possible.[1] The first one is sparsity, which requires the signal to be sparse in some domain. The second one is incoherence, which is applied through the restricted isometry property, a condition sufficient for the recovery of sparse signals.
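The recovery problem described above is commonly written as basis pursuit. The notation here (A, y, x, s, δ_s) is introduced only for illustration and does not come from the entry itself.

```latex
% Basis pursuit: recover a sparse x from m < n linear measurements y = A x.
\min_{x \in \mathbb{R}^{n}} \; \|x\|_{1}
\quad \text{subject to} \quad A x = y,
\qquad A \in \mathbb{R}^{m \times n}, \; m < n .

% Restricted isometry property of order s with constant \delta_s \in (0, 1):
(1 - \delta_{s}) \, \|x\|_{2}^{2}
\;\le\; \|A x\|_{2}^{2}
\;\le\; (1 + \delta_{s}) \, \|x\|_{2}^{2}
\qquad \text{for every } s\text{-sparse } x .
```

Minimizing the 1-norm is the convex stand-in for counting nonzero entries, which is why the sparsity assumption is what makes the underdetermined system solvable.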



A fast and robust first-order method that solves basis-pursuit problems and a large number of extensions (including tv-denoising).


A set of Matlab templates, or building blocks, that can be used to construct efficient, customized solvers for a variety of convex models, including in particular those employed in sparse recovery applications.


ConicBundle is a callable library for C/C++ that implements a bundle method for minimizing the sum of convex functions that are given by first order oracles or arise from Lagrangean relaxation of particular conic linear programs.

Context Free Art

Context Free is a program that generates images from written instructions called a grammar. The program follows the instructions in a few seconds to create images that can contain millions of shapes. Chris Coyne created a small language for design grammars called CFDG. These grammars are sets of non-deterministic rules to produce images. The images are surprisingly beautiful, often from very simple grammars. Context Free is a full graphical environment for editing, rendering, and exploring CFDG design grammars.

See also Structure Synth.


A Fortran 90 library that provides functions to manage grids and arbitrary sets of points, including interpolation and mapping between different coordinate systems.


The toolbox contains MATLAB® routines for computing recurrence plots and related problems.


The Cyclops Tensor Framework is a distributed-memory library that provides support for high-dimensional arrays (tensors). CTF arrays are distributed over MPI communicators, and two-level parallelism (MPI + threads) is supported via extensive internal usage of OpenMP and the capability to exploit threaded BLAS effectively. CTF is capable of performing summation and contraction, as well as data manipulation and mapping.

CTF aims to provide support for distributed memory tensors (scalars, vectors, matrices, etc.). CTF provides summation and contraction routines in Einstein notation, so that any for loops are implicitly described by the index notation. The tensors in CTF are templated (only double and complex<double> currently tested), associated with an MPI communicator, and custom element-wise functions can be defined for contract and sum. A number of example codes using CTF are provided in the examples/ subdirectory. CTF uses hybrid parallelism with MPI and OpenMP, so please set OMP_NUM_THREADS appropriately.
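To make "loops implicitly described by the index notation" concrete, here is a small single-process sketch in plain Python. It is not CTF's API, and the function names and the spec-string format are assumptions made for the example: every index label becomes one loop, and labels absent from the output are summed over.

```python
from itertools import product


def element(tensor, indices):
    """Read one entry of a nested-list tensor."""
    for i in indices:
        tensor = tensor[i]
    return tensor


def contract(spec, a, b, sizes):
    """Evaluate an Einstein-notation contraction such as 'ik,kj->ij'.

    Returns a dict mapping output index tuples to values.
    """
    inputs, out_idx = spec.split("->")
    a_idx, b_idx = inputs.split(",")
    labels = sorted(set(a_idx + b_idx))
    result = {}
    # One implicit loop per index label; repeated labels are summed over.
    for values in product(*(range(sizes[label]) for label in labels)):
        env = dict(zip(labels, values))
        key = tuple(env[label] for label in out_idx)
        term = (element(a, [env[label] for label in a_idx])
                * element(b, [env[label] for label in b_idx]))
        result[key] = result.get(key, 0) + term
    return result


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = contract("ik,kj->ij", A, B, {"i": 2, "j": 2, "k": 2})
print(C[(0, 0)], C[(0, 1)], C[(1, 0)], C[(1, 1)])  # 19 22 43 50
```

A distributed library has the same index bookkeeping to do, plus the decision of how to lay the loops out across processes and threads.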


Cubica is a toolkit for efficient finite element simulations of deformable bodies containing both geometric and material non-linearities. Its main feature is its use of subspace methods, also known as dimensional model reduction or reduced order methods, which can accelerate simulations by several orders of magnitude.


CUDA (after the Plymouth Barracuda[1]), which stands for Compute Unified Device Architecture, is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce.[2] CUDA gives developers direct access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.

Using CUDA, the GPUs can be used for general purpose processing (i.e., not exclusively graphics); this approach is known as GPGPU. Unlike CPUs, however, GPUs have a parallel throughput architecture that emphasizes executing many concurrent threads slowly, rather than executing a single thread very quickly.


This unofficial RPM repository provides the easiest way to install NVIDIA drivers - including CUDA - on the latest Redhat/Fedora derived Linux distributions.


A way to program NVIDIA graphical processors, from Fortran-9X and eventually Matlab.


The FortCUDA project seeks to generate CUDA bindings in F95/2003 using the Fortran 2003 ISO_C_BINDING module. It is intended to give near native call syntax to the CUDA SDK in Fortran 2003. Currently, most Fortran compilers support the ISO_C_BINDING module.

The FortCUDA project mostly consists of a very basic module that contains appropriate bindings for most of the CUDA function calls found in cuda.h and cuda_runtime.h. This is not necessarily complete, but is quite comprehensive. The file parsing capability is provided as part of the distribution and has been used to generate wrappers around functions from a few projects. Specific to CUDA, wrappers have been generated for cuda.h and cuda_runtime.h and are included in the FortCUDA library. In addition Fortran modules have been generated for cublas.h and cufft.h, but are still being checked for accuracy. Theoretically the script can be used on any C header file, and, providing our grammars are adequate, will produce a 90% solution to wrapping the enums, structs, and functions.


CUDA C/C++ and the NVIDIA NVCC compiler toolchain support a number of features designed to make it easier to write portable code, including language integration of host and device code and data, declaration specifiers (e.g. __host__ and __device__) and preprocessor definitions (__CUDACC__). These features combine to enable developers to write code that can be compiled and run on either the host, the device, or both. Other compilers don’t recognize these features, however, so to really write portable code, we need preprocessor macros. This is where Hemi comes in.


NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks. It emphasizes performance, ease-of-use, and low memory overhead. NVIDIA cuDNN is designed to be integrated into higher-level machine learning frameworks, such as UC Berkeley’s popular Caffe software. The simple, drop-in design allows developers to focus on designing and implementing neural net models rather than tuning for performance, while still achieving the high performance modern parallel computing hardware affords.


Cupid is a development and training environment for models that use the Earth System Modeling Framework (ESMF) and National Unified Operational Capability (NUOPC) Layer infrastructure. Cupid is implemented as a plug-in for the widely used Eclipse Integrated Development Environment (IDE). Together, Cupid and Eclipse form an accessible, appealing training environment that makes it easier and faster to build NUOPC-based applications.


A better microcontroller IDE.


D4M is a breakthrough in computer programming that combines the advantages of five distinct processing technologies (sparse linear algebra, associative arrays, fuzzy algebra, distributed arrays, and triple-store/NoSQL databases such as Hadoop HBase and Apache Accumulo) to provide a database and computation system that addresses the problems associated with Big Data. D4M significantly improves search, retrieval, and analysis for any business or service that relies on accessing and exploiting massive amounts of digital data. Evaluations have shown D4M to simultaneously increase computing performance and to decrease the effort required to build applications by as much as 100x. Improved performance translates into faster, more comprehensive services provided by companies involved in healthcare, Internet search, network security, and more. Less, and simplified, coding reduces development times and costs. Moreover, the D4M layered architecture provides a robust environment that is adaptable to various databases, data types, and platforms.


Damaris is a middleware for I/O and data management targeting large-scale, MPI-based HPC simulations. It initially proposed to dedicate cores for asynchronous I/O in multicore nodes of recent HPC platforms, with an emphasis on ease of integration in existing simulations, efficient resource usage (with the use of shared memory) and simplicity of extension through plugins.

Over the years, Damaris has evolved into a more elaborate system, providing the possibility to use dedicated cores or dedicated nodes for data processing and I/O. It proposes a seamless connection to the VisIt software to enable in situ visualization with minimum impact on run time. Damaris provides an extremely simple API and can be easily integrated in existing large-scale simulations.


The goal of the Damsel project is to enable Exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models.


Dart is a cohesive, scalable platform for building apps that run on the web (where you can use Polymer) or on servers (such as with Google Cloud Platform). Use the Dart language, libraries, and tools to write anything from simple scripts to full-featured apps.


DART is a community facility for ensemble DA developed and maintained by the Data Assimilation Research Section (DAReS) at the National Center for Atmospheric Research (NCAR). DART provides modelers, observational scientists, and geophysicists with powerful, flexible DA tools that are easy to implement and use and can be customized to support efficient operational DA applications. DART is a software environment that makes it easy to explore a variety of data assimilation methods and observations with different numerical models and is designed to facilitate the combination of assimilation algorithms, models, and real (as well as synthetic) observations to allow increased understanding of all three. DART includes extensive documentation, a comprehensive tutorial, and a variety of models and observation sets that can be used to introduce new users or graduate students to ensemble DA. DART also provides a framework for developing, testing, and distributing advances in ensemble DA to a broad community of users by removing the implementation-specific peculiarities of one-off DA systems.

DART employs a modular programming approach to apply an Ensemble Kalman Filter which nudges the underlying models toward a state that is more consistent with information from a set of observations. Models may be swapped in and out, as can different algorithms in the Ensemble Kalman Filter. The method requires running multiple instances of a model to generate an ensemble of states. A forward operator appropriate for the type of observation being assimilated is applied to each of the states to generate the model’s estimate of the observation.
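The update described above (run an ensemble, apply a forward operator to each member, nudge the states toward the observation) can be sketched for a single scalar observation. This is a simplified illustration and not DART code: the deterministic "shift and shrink" variant, the forward operator, and all names are assumptions chosen for the example.

```python
from math import sqrt
from statistics import mean, variance


def forward_operator(state):
    """Hypothetical forward operator: the instrument sees state variable 0 directly."""
    return state[0]


def assimilate(ensemble, obs_value, obs_error_var):
    """Update every member of `ensemble` using one scalar observation."""
    # Step 1: each member's own estimate of the observation.
    predicted = [forward_operator(member) for member in ensemble]
    prior_mean = mean(predicted)
    prior_var = variance(predicted)  # sample variance; must be non-zero

    # Step 2: combine prior and observation as two Gaussians.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_error_var)
    post_mean = post_var * (prior_mean / prior_var + obs_value / obs_error_var)

    # Step 3: shift and shrink the predicted values deterministically.
    shrink = sqrt(post_var / prior_var)
    increments = [post_mean + shrink * (p - prior_mean) - p for p in predicted]

    # Step 4: regress the increments onto each state variable.
    n = len(ensemble)
    updated = [list(member) for member in ensemble]
    for j in range(len(ensemble[0])):
        column = [member[j] for member in ensemble]
        col_mean = mean(column)
        cov = sum((c - col_mean) * (p - prior_mean)
                  for c, p in zip(column, predicted)) / (n - 1)
        for i in range(n):
            updated[i][j] += (cov / prior_var) * increments[i]
    return updated


ensemble = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
analysis = assimilate(ensemble, obs_value=4.0, obs_error_var=1.0)
print([round(mean(col), 6) for col in zip(*analysis)])  # [3.0, 30.0]
```

The second state variable is never observed, yet its mean moves from 20 to 30 because the ensemble says it is correlated with the first. Swapping `forward_operator` or the update rule in step 3 leaves the rest untouched, which is the modularity the paragraph above refers to.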


DASH is a C++ Template Library for Distributed Data Structures with Support for Hierarchical Locality for HPC and Data-Driven Science.

Exascale systems are scheduled to become available in 2018-2020 and will be characterized by extreme scale and a multilevel hierarchical organization. Efficient and productive programming of these systems will be a challenge, especially in the context of data-intensive applications. Adopting the promising notion of Partitioned Global Address Space (PGAS) programming, the DASH project develops a data-structure oriented C++ template library that provides hierarchical PGAS-like abstractions for important data containers (multidimensional arrays, lists, hash tables, etc.) and allows a developer to control (and explicitly take advantage of) the hierarchical data layout of global data structures. In contrast to other PGAS approaches such as UPC, DASH does not propose a new language or require compiler support to realize global address space semantics. Instead, operator overloading and other advanced C++ features are used to provide the semantics of data residing in a global and hierarchically partitioned address space based on a runtime system with one-sided messaging primitives provided by MPI or GASNet. As such, DASH can co-exist with parallel programming models already in widespread use (like MPI) and developers can take advantage of DASH by incrementally replacing existing data structures with the implementation provided by DASH. Efficient I/O directly to and from the hierarchical structures and DASH-optimized algorithms such as map-reduce are also part of the project. Two applications from molecular dynamics and geoscience are driving the project and are adapted to use DASH in the course of the project.


Dat is an open source project that provides a streaming interface between every file format and data storage backend.


DataHub is a unified, managed, collaborative platform for making data-processing easy. Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system.

Data Transfer and Storage


A scalable data transfer management tool for the GridFTP transfer protocol. The goal is to reliably manage transfers of 1+ PB spread across millions of files.


BeStMan is a full implementation of SRM v2.2, developed by Lawrence Berkeley National Laboratory, for disk based storage systems and mass storage systems such as HPSS. End users may have their own personal BeStMan that manages and provides an SRM interface to their local disks or storage systems. It works on top of existing disk-based unix file systems, and has been reported so far to work on file systems such as NFS, PVFS, AFS, GFS, GPFS, PNFS, and Lustre. It also works with any existing file transfer service, such as gsiftp, http, https and ftp. It requires minimal administrative effort for deployment and maintenance.


CASTOR, which stands for the CERN Advanced STORage manager, is a hierarchical storage management (HSM) system developed at CERN and used to store physics production files and user files. Files can be stored, listed, retrieved and accessed in CASTOR using command line tools or applications built on top of the different data transfer protocols like RFIO (Remote File IO), ROOT libraries, GridFTP and XROOTD. CASTOR manages disk cache(s) and the data on tertiary storage or tapes. Currently (2007) there are some 60 million files and about 7 petabytes of data in CASTOR.

CASTOR provides a UNIX-like directory hierarchy of file names. The directories are always rooted /castor/ (the will be different in other CASTOR sites). The CASTOR name space can be viewed and manipulated only through CASTOR client commands and library calls. OS commands like ls or mkdir will not work on CASTOR files. The CASTOR name space holds permanent tape residence of the CASTOR files, while the more volatile disk residence is only known to the stager, which is the disk cache management component in CASTOR. When accessing or modifying a CASTOR file, one must therefore always use a stager.


The DaviX project aims to provide a solution for optimized remote I/O, data management and management of large collections of files over the WebDAV, Amazon S3 and HTTP protocols. Davix is multi-platform, open source and written in C++.


A system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods. Depending on the Persistency Model, dCache provides methods for exchanging data with backend (tertiary) Storage Systems as well as space management, pool attraction, dataset replication, hot spot determination and recovery from disk or node failures. Connected to a tertiary storage system, the cache simulates unlimited direct access storage space.


The dCache dccp client.


DataMover-Lite (DML) is a simple file transfer tool with a graphical user interface which supports multi-protocol data movement. DML is available in both webstart and standalone versions. Currently, DML supports http, https, ftp, gridftp, lahfs and scp. For GridFTP, DML also supports directory browsing and transferring.


The Disk Pool Manager (DPM) is a lightweight storage solution for grid sites. It offers a simple way to create a disk-based grid storage element and supports relevant protocols (SRM, gridFTP, RFIO) for file management and access. It focuses on manageability (ease of installation, configuration, low effort of maintenance), while providing all required functionality for a grid storage solution (support for multiple disk server nodes, different space types, multiple file replicas in disk pools).


GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. The GridFTP protocol is based on FTP, the highly-popular Internet file transfer protocol. We have selected a set of protocol features and extensions defined already in IETF RFCs and added a few additional features to meet requirements from current data grid projects.

GridFTP also addresses the problem of incompatibility between storage and access systems. Previously, each data provider would make their data available in their own specific way, providing a library of access functions. This made it difficult to obtain data from multiple sources, requiring a different access method for each, and thus dividing the total available data into partitions. GridFTP provides a uniform way of accessing the data, encompassing functions from all the different modes of access, building on and extending the universally accepted FTP standard. FTP was chosen as a basis for it because of its widespread use, and because it has a well defined architecture for extensions to the protocol (which may be dynamically discovered).


An open source utility that provides fast incremental file transfer.


For transferring large, deep file trees, rsync will pause while it generates lists of files to process. Since Version 3, it does this pretty fast, but on sluggish filesystems, it can take hours or even days before it will start to actually exchange rsync data.

Second, due to various bottlenecks, rsync will tend to use less than the available bandwidth on high speed networks. Starting multiple instances of rsync can improve this significantly. However, on such transfers, it is also easy to overload the available bandwidth, so it would be nice to both limit the bandwidth used if necessary and also to limit the load on the system.

Parsyncfp tries to satisfy all these conditions and more by:

  • using the fpart file partitioner which can produce lists of files very rapidly

  • allowing re-use of the cache files so generated

  • doing crude loadbalancing of the number of active rsyncs, suspending and unsuspending the processes as necessary

  • using rsync’s own bandwidth limiter (--bwlimit) to throttle the total bandwidth

  • making rsync’s own vast option selection available as a pass-through
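The partition-then-parallelize approach in the list above can be sketched as follows. This is not parsyncfp's code: the greedy splitter stands in for fpart, the even division of bandwidth between processes is an assumption, and the paths and chunk file names are made up. Only the rsync options `--bwlimit` and `--files-from` are real.

```python
def partition_by_size(files, n_chunks):
    """Greedy split of (path, size) pairs into n_chunks lists with similar totals."""
    chunks = [[] for _ in range(n_chunks)]
    totals = [0] * n_chunks
    # Place the largest files first, always into the currently lightest chunk.
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        lightest = totals.index(min(totals))
        chunks[lightest].append(path)
        totals[lightest] += size
    return chunks, totals


def rsync_commands(chunk_list_files, src, dest, total_kbps):
    """Build one rsync command per chunk list, sharing a total bandwidth cap."""
    per_process = max(1, total_kbps // len(chunk_list_files))
    return [
        ["rsync", "-a", f"--bwlimit={per_process}", f"--files-from={lst}", src, dest]
        for lst in chunk_list_files
    ]


files = [("a", 50), ("b", 30), ("c", 20), ("d", 10)]
chunks, totals = partition_by_size(files, 2)
print(chunks)  # [['a', 'd'], ['b', 'c']]
print(totals)  # [60, 50]

# Hypothetical chunk list files and endpoints, for illustration only.
commands = rsync_commands(["chunk0.txt", "chunk1.txt"], "/data/", "host:/backup/", 10000)
print(commands[0])
```

Each command list could be handed to `subprocess.Popen`; suspending and resuming processes to balance load, as the third bullet describes, is left out of the sketch.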


SRM-Lite is a simple command-line based tool with pluggable file transfer protocol support. SRM-Lite supports scp and sftp in a high-performance way (hpn-ssh).


The XROOTD project aims at giving high performance, scalable fault tolerant access to data repositories of many kinds. The typical usage is to give access to file-based ones. It is based on a scalable architecture, a communication protocol, and a set of plugins and tools based on those. The freedom to configure it and to make it scale (for size and performance) allows the deployment of data access clusters of virtually any size, which can include sophisticated features, like authentication/authorization, integrations with other systems, WAN data distribution, etc.

The XRootD software framework is a fully generic suite for fast, low latency and scalable data access, which can natively serve any kind of data organized as a hierarchical filesystem-like namespace, based on the concept of directory. As a general rule, particular emphasis has been put on the quality of the core software parts.


DataMPI is an efficient, flexible, and productive communication library, which provides a set of key-value pair based communication interfaces that extend MPI for Big Data. By utilizing the efficient communication technologies of the High-Performance Computing area, DataMPI can speed up emerging data intensive computing applications. DataMPI takes a step in bridging the two fields of HPC and Big Data.

DataMPI can support multiple modes for various Big Data Computing applications, including Common, MapReduce, Streaming, and Iteration. The current version implements the functionalities and features of the Common mode, which aims to support the single program, multiple data (SPMD) applications. The remaining modes will be released in the future.

The current implementation of DataMPI is extending mpiJava. We also integrate some features from Hadoop under Apache License 2.0. The current evaluations of DataMPI use MVAPICH2 as the backend. DataMPI also supports other MPI implementations, such as MPICH2.


The DaviX project aims to provide a solution for optimized remote I/O, data management and management of large collections of files over the WebDAV, Amazon S3 and HTTP protocols. Davix is multi-platform, open source and written in C++.

It is composed of two components:

  • libdavix: a C++ library. It offers an HTTP API, a remote I/O API and a POSIX compatibility layer.

  • davix-*: several utilities for file transfer, management of large collections of files and management of large files.

DaviX supports features like session reuse, redirection caching, vector operations, Metalink, X509 client certificate, proxy certificate, SOCKS4/5 or VOMS.

Dax Toolkit

The Dax Toolkit supports the fine-grained concurrency for data analysis and visualization algorithms required to drive exascale computing. The basic computational unit of the Dax Toolkit is a worklet, a function that implements the algorithm’s behavior on an element of a mesh (that is, a point, edge, face, or cell) or a small local neighborhood. The worklet is constrained to be serial and stateless; it can access only the element passed to and from the invocation. With this constraint, the serial worklet function can be concurrently executed on an unlimited number of threads without the complications of memory clashes or other race conditions.

The Dax Toolkit provides dispatchers that apply worklets to all elements in an input mesh, the results of which are collected into a resulting mesh. Although worklets are not allowed communication, many visualization algorithms require operations such as variable array packing and coincident topology resolution that intrinsically require significant coordination among threads. Dax enables such algorithms by classifying and implementing the most common and versatile communicative operations, which, when used in conjunction with the appropriate worklets, complete the visualization algorithms.
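The worklet-and-dispatcher split described above can be sketched with Python's standard thread pool. This illustrates the pattern only and is not the Dax Toolkit's C++ API; the function names and the toy "mesh" are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def cell_average_worklet(cell_point_values):
    """Serial, stateless worklet: it sees only the one cell it is handed."""
    return sum(cell_point_values) / len(cell_point_values)


def dispatch(worklet, cells, max_workers=4):
    """Apply the worklet to every cell concurrently, collecting results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worklet, cells))


# Toy mesh: each cell is the tuple of scalar values at its points.
cells = [(0.0, 1.0, 2.0), (3.0, 3.0, 3.0), (1.0, 5.0, 6.0)]
print(dispatch(cell_average_worklet, cells))  # [1.0, 3.0, 4.0]
```

Because the worklet touches no shared state, the dispatcher is free to pick any number of workers without risking a race condition, which is the property the constraint is designed to guarantee.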


DCCRG is an easy to use grid for FVM/FEM simulations written in C++. It handles load balancing and neighbour cell data updates between processes automatically. MPI is used for parallelization.

The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes.


DGGRID is a public domain software program for creating and manipulating Discrete Global Grids. A Discrete Global Grid (DGG) consists of a set of regions that form a partition of the Earth’s surface, where each region has a single point contained in the region associated with it. Each region/point combination is called a cell. Depending on the application, data objects or values may be associated with the regions, points, or cells of a DGG. A Discrete Global Grid System (DGGS) is a series of discrete global grids, usually consisting of increasingly finer resolution grids (though the term DGG is often used interchangeably with the term DGGS).


We introduce the Declaratron, a system which takes a declarative approach to specifying mathematically based scientific computation. This uses displayable mathematical notation (Content MathML) and is both executable and semantically well defined. We combine domain specific representations of physical science (e.g. CML, Chemical Markup Language), MathML formulae and computational specifications (DeXML) to create executable documents which include scientific data and mathematical formulae. These documents preserve the provenance of the data used, and build tight semantic links between components of mathematical formulae and domain objects---in effect grounding the mathematical semantics in the scientific domain.


In this paper it is suggested that a stochastic isotropic diffusive process, representing a spatial first order auto regressive process (AR(1)-process), can be used as a null hypothesis for the spatial structure of climate variability. By comparing the leading empirical orthogonal functions (EOFs) of a fitted null hypothesis with EOF modes of an observed data set, inferences about the nature of the observed modes can be made. The concept and procedure of fitting the null hypothesis to the observed EOFs is in analogy to time analysis, where an AR(1)-process is fitted to the statistics of the time series in order to evaluate the nature of the time scale behavior of the time series. The formulation of a stochastic null hypothesis allows one to define teleconnection patterns as those modes that are most distinguished from the stochastic null hypothesis. The method is applied to several artificial and real data sets including the sea surface temperature of the tropical Pacific and Indian Ocean and the Northern Hemisphere wintertime and tropical sea level pressure.

A Matlab script for computing the Distinct EOFs is available.


DEPOT is a framework for easily storing and serving files in web applications on Python2.6+ and Python3.2+. Modern web applications need to rely on a huge amount of stored images, generated files and other data which is usually best to keep outside of your database. DEPOT provides a simple and effective interface for storing your files on a storage backend at your choice (Local, S3, GridFS) and easily relate them to your application models (SQLAlchemy, Ming) like you would for plain data.


Delite is a research project from Stanford University’s Pervasive Parallelism Laboratory (PPL). Delite is a compiler framework and runtime for parallel embedded domain-specific languages (DSLs). Our goal is to enable the rapid construction of high performance, highly productive DSLs.

Delite is still in alpha, and there is no official release. However, the develop (Delite) and delite-develop (LMS) branches should be relatively stable for experimental development of new DSLs. For those interested in developing their own DSLs, we highly recommend using Forge, which is itself a DSL that automates much of the process of creating DSLs embedded in Scala. For those interested in using instead of building DSLs, alpha builds of OptiML, a DSL for machine learning, OptiQL, a DSL for data querying, and OptiGraph, a DSL for graph analytics, are currently available.


OptiML is an embedded domain-specific language for machine learning. OptiML is developed as a research project from Stanford University’s Pervasive Parallelism Laboratory (PPL).

OptiML is currently targeted at machine learning researchers and algorithm developers; it aims to provide a productive, high performance, MATLAB-like environment for linear algebra supplemented with machine learning specific abstractions. Our primary goal is to allow machine learning practitioners to write code in a highly declarative manner and still achieve high performance on a variety of underlying parallel, heterogeneous devices. The same OptiML program should run well and scale on a CMP (chip multi-processor), a GPU, a combination of CMPs and GPUs, clusters of CMPs and GPUs, and eventually even FPGAs and other specialized accelerators.

In particular, OptiML is designed to allow statistical inference algorithms expressible by the Statistical Query Model to be both easy to express and very fast to execute. These algorithms can be expressed in a summation form, and can be parallelized using fine-grained map-reduce operations. OptiML employs aggressive optimizations to reduce unnecessary memory allocations and fuse operations together to make these as fast as possible. OptiML also attempts to specialize implementations to particular hardware devices as much as possible to achieve the best performance.
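"Summation form" means the quantity an algorithm needs is a sum of per-sample terms, so chunks of data can be processed independently and their partial sums combined. The sketch below shows this for the gradient of a one-parameter least-squares model. It is plain Python rather than OptiML, and the model, data and function names are assumptions chosen for illustration.

```python
from functools import reduce


def partial_gradient(chunk, w):
    """Map step: sum of (w*x - y) * x over one chunk of (x, y) pairs."""
    return sum((w * x - y) * x for x, y in chunk)


def gradient(data, w, n_chunks=2):
    """Split the data, map over the chunks, then reduce the partial sums."""
    chunks = [data[i::n_chunks] for i in range(n_chunks)]
    partials = map(lambda chunk: partial_gradient(chunk, w), chunks)
    return reduce(lambda a, b: a + b, partials, 0.0)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
print(gradient(data, 2.0))  # 0.0
print(gradient(data, 1.0))  # -14.0
```

The map step has no dependencies between chunks, so it could run on separate cores or devices; only the final reduction needs the results brought together.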


A prototype meta DSL that generates Delite DSL implementations from a specification-like program.


Codes for detrending Kepler and other light curves. To study exoplanetary atmospheres, we typically require a 10^-4 to 10^-5 level of accuracy in flux. Achieving such precision has become the central challenge to exoplanetary research and is often impeded by systematic (non-Gaussian) noise from the instrument, stellar activity, or both. Dedicated missions, such as Kepler, feature an a priori instrument calibration plan to the required accuracy but nonetheless remain limited by stellar systematics. More generic instruments often lack a sufficiently defined instrument response function, making them very hard to calibrate. The correct calibration strategy is hence of paramount importance and requires a dedicated effort and out-of-the-box thinking. In recent years, we have made significant advances in exoplanetary spectroscopy through improvements in data de-trending.

See PyKE.


DistAlgo is a very high-level language for programming distributed algorithms. This project implements a DistAlgo compiler with Python as the target language. In the following text, the name DistAlgo refers to the compiler and not the language.

Distributed Array Protocol

The Distributed Array Protocol (DAP) is a process-local protocol that allows two subscribers, called the “producer” and the “consumer” or the “exporter” and the “importer”, to communicate the essential data and metadata necessary to share a distributed-memory array between them. This allows two independently developed components to access, modify, and update a distributed array without copying. The protocol formalizes the metadata and buffers involved in the transfer, allowing several distributed array projects to collaborate, facilitating interoperability. By not copying the underlying array data, the protocol allows for efficient sharing of array data.


A tool for multidimensional variational analysis (divand) is presented. It allows the interpolation and analysis of observations on curvilinear orthogonal grids in an arbitrary high dimensional space by minimizing a cost function. This cost function penalizes the deviation from the observations, the deviation from a first guess and abruptly varying fields based on a given correlation length (potentially varying in space and time). Additional constraints can be added to this cost function such as an advection constraint which forces the analysed field to align with the ocean current. The method decouples naturally disconnected areas based on topography and topology.


D-LITe is a universal architecture for building simple applications over heterogeneous sensor networks.


A Matlab implementation of the Sparsity-Promoting Dynamic Mode Decomposition (DMDSP) algorithm. Dynamic Mode Decomposition (DMD) is an effective means for capturing the essential features of numerically or experimentally generated snapshots, and its sparsity-promoting variant DMDSP achieves a desirable tradeoff between the quality of approximation (in the least-squares sense) and the number of modes that are used to approximate available data. Sparsity is induced by augmenting the least-squares deviation between the matrix of snapshots and the linear combination of DMD modes with an additional term that penalizes the ℓ1-norm of the vector of DMD amplitudes. We employ the alternating direction method of multipliers (ADMM) to solve the resulting convex optimization problem and to efficiently compute the globally optimal solution.
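The penalized least-squares problem described above can be written out as follows. The notation is an assumption made for illustration and is not taken from this entry: \Psi_0 is the matrix of snapshots, \Phi holds the DMD modes as columns, D_\alpha is the diagonal matrix of amplitudes \alpha_i, V_and is the Vandermonde matrix built from the DMD eigenvalues, and \gamma >= 0 is the weight on sparsity.

```latex
\underset{\alpha}{\operatorname{minimize}} \quad
\left\| \Psi_0 - \Phi \, D_\alpha \, V_{\mathrm{and}} \right\|_F^2
\; + \; \gamma \sum_{i=1}^{r} \left| \alpha_i \right|
```

Larger values of \gamma drive more amplitudes to zero, so fewer modes are retained at the cost of a larger least-squares residual.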


Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.


A package for deploying and managing Docker containers. Project Atomic integrates the tools and patterns of container-based application and service deployment with trusted operating system platforms to deliver an end-to-end hosting architecture that’s modern, reliable, and secure.

An Atomic Host is a lean operating system designed to run Docker containers, built from upstream CentOS, Fedora, or Red Hat Enterprise Linux RPMs. It provides all the benefits of the upstream distribution, plus the ability to perform atomic upgrades and rollbacks.

Docker Machine

A tool that makes it really easy to go from “zero to Docker”. Machine creates Docker Engines on your computer, on cloud providers, and/or in your data center, and then configures the Docker client to securely talk to them.


An innovative feature of DOpElib is to provide a software toolkit to solve forward PDE problems as well as optimal control problems constrained by PDE. DOpElib concentrates on a unified approach for both linear and nonlinear problems by interpreting every PDE problem as nonlinear and applying a Newton method to solve it. The focus is on the numerical solution of both stationary and nonstationary problems which come from different application fields, like elasticity and plasticity, fluid dynamics, and multiphysics problems such as fluid-structure interactions.


A software experimental platform, providing the foundations needed to develop dedicated modular applications aggregating functionalities embedded using low-level and interchangeable software entities - plugins - and orchestrated through high-level software entities - scripts or GUIs.


Platform for multi-physics simulations built on top of dtk.


DxTer is a system for researching Design by Transformation (DxT). DxTer takes as input a knowledge base of software design transformations and a graph representing functionality to be implemented. It generates a search space of implementations from the transformations, estimates their costs, and outputs the best code.

DxT can be used as a principled way to re-engineer legacy applications or to forward engineer new applications. (The former is used for domains containing a single application, whereas the latter is used when transformations describe a combinatorial number of derivable applications). Our primary application is to forward engineer dense linear algebra applications using the Flame methodology and Elemental library. Our goal is to express the design knowledge of Elemental as transformations, and to generate Elemental libraries for new architectures, rather than hand-deriving such libraries. Doing so will be a significant accomplishment — both in software engineering in general, and dense linear algebra in particular.


DZSlides is a one-page-template to build your presentation in HTML5 and CSS3.

Earth Orbit

An astronomically precise and accurate model that offers 3-D visualizations of Earth’s orbital geometry, Milankovitch parameters and the ensuing insolation forcing. The model is developed in MATLAB® as a user-friendly graphical user interface. Users are presented with a choice between the Berger (1978a) and Laskar et al. (2004) astronomical solutions for eccentricity, obliquity and precession. A "demo" mode is also available, which allows the Milankovitch parameters to be varied independently of each other, so that users can isolate the effects of each parameter on orbital geometry, the seasons, and insolation. A 3-D orbital configuration plot, as well as various surface and line plots of insolation and insolation anomalies on various time and space scales are produced. Insolation computations use the model’s own orbital geometry with no additional a priori input other than the Milankovitch parameter solutions.


EAVL is the Extreme-scale Analysis and Visualization Library.


An integrated development environment (IDE). It contains a base workspace and an extensible plug-in system for customizing the environment. Written mostly in Java, Eclipse can be used to develop applications. By means of various plug-ins, Eclipse may also be used to develop applications in other programming languages: Ada, ABAP, C, C++, COBOL, Fortran, Haskell, JavaScript, Lasso, Lua, Natural, Perl, PHP, Prolog, Python, R, Ruby (including Ruby on Rails framework), Scala, Clojure, Groovy, Scheme, and Erlang. It can also be used to develop packages for the software Mathematica. Development environments include the Eclipse Java development tools (JDT) for Java and Scala, Eclipse CDT for C/C++ and Eclipse PDT for PHP, among others.


Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.


A new tool for recording and replaying energy harvesting conditions. Energy harvesting is a necessity for many small, embedded sensing devices that must operate maintenance-free for long periods of time. However, understanding how the environment changes and its effects on device behavior has always been a source of frustration. Ekho allows system designers working with ultra low power devices to realistically predict how new hardware and software configurations will perform before deployment. By taking advantage of electrical characteristics all energy sources share, Ekho is able to emulate many different energy sources (e.g., solar, RF, thermal, and vibrational) and takes much of the guesswork out of experimentation with tiny, energy harvesting sensing systems.


ELCIRC is an unstructured-grid model designed for the effective simulation of 3D baroclinic circulation across river-to-ocean scales. It uses a finite-volume/finite-difference Eulerian-Lagrangian algorithm to solve the shallow water equations, written to realistically address a wide range of physical processes and of atmospheric, ocean and river forcings. The numerical algorithm is low-order, but volume conservative, stable and computationally efficient. It also naturally incorporates wetting and drying of tidal flats. While originally developed to meet specific modeling challenges for the Columbia River, ELCIRC has been extensively tested against standard ocean/coastal benchmarks, and is starting to be applied to estuaries and continental shelves around the world.


A functional reactive language for interactive applications. Elm is great for 2D and 3D games, diagrams, widgets, and websites.


The eLesson Markup Language (eLML) is an open source XML framework for creating structured eLessons using XML. For easier lesson authoring we offer the web-based WYSIWYG Firedocs eLML Editor, and to create eLML template layouts without any XSLT knowledge you can use our new Template Builder. Once you have created your eLML lesson you can transform it into many different output formats like IMS Content Package or SCORM, various HTML templates, eBooks (ePub format), PDF, Office documents (ODF) and many more listed under "Output Formats".

El Topo

El Topo is a public domain C++ package for tracking dynamic surfaces represented as triangle meshes in 3D. It robustly handles topology changes such as merging and pinching off, while adaptively maintaining a tangle-free, high-quality triangulation.

The current release contains source for the El Topo library, as well as Talpa, an executable demonstrating several applications of our method. The code has been tested on OS/X and Linux and is freely available for download.


Embree is a collection of high-performance ray tracing kernels, developed at Intel. The target users of Embree are graphics application engineers who want to improve the performance of their application by leveraging the optimized ray tracing kernels of Embree. The kernels are optimized for photo-realistic rendering on the latest Intel® processors with support for SSE, AVX, AVX2, and the 16-wide Intel® Xeon Phi™ coprocessor vector instructions. Embree supports runtime code selection to choose the traversal and build algorithms that best match the instruction set of your CPU. We recommend using Embree through its API to get the highest benefit from future improvements. Embree is released as Open Source under the Apache 2.0 license.


EMPIRE is the name given to a way of changing the source code of a dynamical model so that it can interface with sequential data assimilation methods.

EMPIRE should be one of the quickest and easiest ways in which to modify the source code of the model to use data assimilation.


Emscripten is an LLVM-based project that compiles C and C++ into highly-optimizable JavaScript in asm.js format. This lets you run C and C++ on the web at near-native speed, without plugins.


Equelle is a domain-specific language for the specification of simulators for systems of PDEs through a high-level syntax. The language allows the user to focus on equations and numerics while hiding the low-level details of software and hardware implementations.


The Federation of Earth Science Information Partners (ESIP) is a broad-based, distributed community of data and information technology practitioners.


Our proposed work consists of three thrust areas that address these contemporary challenges. First, we will provide high performance I/O middleware that makes effective use of computational platforms, researching a number of optimization strategies and deploying them through the HDF5 software. Second, we will improve the productivity of application developers by hiding the complexity of parallel I/O via new auto-tuning and transparent data re-organization techniques, and by extending our existing work in easy-to-use, high-level APIs that expose scientific data models. Third, we will facilitate scientific analysis for users by extending query-based techniques, developing novel in situ analysis capabilities, and making sure that visualization tools use best practices when reading HDF5 data.


The central goal of ExaStencils is to develop a radically new software technology for applications with exascale performance. To reach this goal, the project focusses on a comparatively narrow but very important application domain. The aim is to enable a simple and convenient formulation of problem solutions in this domain. The software technology developed in ExaStencils shall facilitate the highly automatic generation of a large variety of efficient implementations via the judicious use of domain-specific knowledge in each of a sequence of optimization steps such that, at the end, exascale performance results.

The application domain chosen is that of stencil codes, i.e., compute-intensive algorithms in which data points in a grid are redefined repeatedly as a combination of the values of neighboring points. The neighborhood pattern used is called a stencil. Stencils codes are used for the solution of discrete partial differential equations and the resulting linear systems.
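As a concrete instance of the pattern described above, here is a minimal sketch of a one-dimensional Jacobi stencil in plain Python. It is not ExaStencils code; the function names `jacobi_sweep` and `solve` are hypothetical, and the three-point averaging stencil is chosen only because it is the simplest discretisation of a Laplace problem.

```python
def jacobi_sweep(u):
    """Apply the three-point stencil once.

    Every interior point is replaced by the mean of its two neighbours,
    read from the previous grid. The two boundary values stay fixed.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def solve(u, sweeps):
    """Repeat the stencil update a fixed number of times."""
    for _ in range(sweeps):
        u = jacobi_sweep(u)
    return u


# Boundary values 0 and 1 are held fixed; the interior starts at 0
# and relaxes towards a straight line between the two boundaries.
grid = solve([0.0, 0.0, 0.0, 0.0, 1.0], sweeps=200)
```

Each sweep touches every grid point with the same small neighbourhood pattern, which is why such codes are attractive targets for automatic generation of optimized implementations.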


An EXpression Capturing Finite Element Library is a library developed during my PhD as a means to explore the benefits of using active library techniques for the performance optimisation of finite-element simulations. In particular active library techniques facilitate efficient implementations of domain specific languages.

Excafé only supports triangular meshes with Lagrange basis functions at present. Furthermore boundary integrals have not yet been implemented. However, the functionality present is more than sufficient to implement an incompressible Navier-Stokes solver, which is included in the distribution.

One topic that Excafé has been used to explore is the symbolic analysis of the expressions in finite element local assembly matrices. Excafé has access to run-time representations of variational forms and basis functions. It uses this to build symbolic representations of each entry of the local assembly matrix. Once it has these, it uses a common sub-expression elimination pass targeted at polynomial evaluation to find an evaluation strategy for these expressions that minimizes operation count.


EZFIO is the Easy Fortran I/O library generator. It automatically generates an I/O library from a simple configuration file. The produced library contains Fortran subroutines to read/write the data from/to disk, and to check if the data exists. A Python and an OCaml API are also provided.

With EZFIO, the data is organized in a file system inside a main directory. This main directory contains subdirectories, which contain files. Each file corresponds to one piece of data. For atomic data the file is a plain text file, and for array data the file is a gzipped text file.
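The on-disk layout described above can be imitated with a few lines of standard-library Python. This is only a sketch of the idea, not the generated EZFIO API: the function names, the `.gz` suffix, the one-value-per-line format, and the `molecule` example group are all assumptions made for illustration.

```python
import gzip
import os
import tempfile


def write_scalar(root, group, name, value):
    """Store one atomic value as a plain text file root/group/name."""
    os.makedirs(os.path.join(root, group), exist_ok=True)
    with open(os.path.join(root, group, name), "w") as f:
        f.write(f"{value}\n")


def write_array(root, group, name, values):
    """Store an array as a gzipped text file, one value per line."""
    os.makedirs(os.path.join(root, group), exist_ok=True)
    with gzip.open(os.path.join(root, group, name + ".gz"), "wt") as f:
        f.write("\n".join(str(v) for v in values) + "\n")


def has(root, group, name):
    """Check whether a piece of data exists, in either form."""
    base = os.path.join(root, group, name)
    return os.path.exists(base) or os.path.exists(base + ".gz")


def read_scalar(root, group, name):
    """Return the stored text; the caller converts it to a number."""
    with open(os.path.join(root, group, name)) as f:
        return f.read().strip()


def read_array(root, group, name):
    """Return the stored values as a list of strings."""
    with gzip.open(os.path.join(root, group, name + ".gz"), "rt") as f:
        return f.read().split()


root = tempfile.mkdtemp()
write_scalar(root, "molecule", "num_atoms", 3)
write_array(root, "molecule", "mass", [1.008, 15.999, 1.008])
```

Keeping every item in its own file means the data can be inspected with ordinary shell tools, at the cost of many small files for large data sets.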


The EvoGrid is a worldwide, cross-disciplinary effort to create an abstract, yet plausible simulation of the chemical origins of life on Earth. One could think of this as an artificial origin of life experiment. Our strategy is to employ a large number of computers in a grid to simulate a digital primordial soup along with a distributed set of computers acting as observers looking into that grid. These observers, modeled after the very successful @Home scientific computation projects, will be looking for signs of emergent complexity and reporting back to the central grid.


The Factor programming language combines powerful language features with a full-featured library. The implementation is fully compiled for performance, while still supporting interactive development. Factor applications are portable between all common platforms. Factor can deploy stand-alone applications on all platforms.

Factor belongs to the family of concatenative languages: this means that, at the lowest level, a Factor program is a series of words (functions) that manipulate a stack of references to dynamically-typed values. This gives the language a powerful foundation which allows many abstractions and paradigms to be built on top.
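To make the stack model concrete, here is a toy evaluator written in Python. It is not how Factor is implemented and covers only integers and five words; the function name `run` and the chosen word set are assumptions for illustration.

```python
def run(program, stack=None):
    """Evaluate a whitespace-separated sequence of words against a stack.

    Integer literals push themselves; every other token is looked up in
    a small dictionary of words that pop and push stack values.
    """
    stack = [] if stack is None else stack
    words = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),               # copy the top item
        "swap": lambda s: s.extend([s.pop(), s.pop()]),  # exchange top two
        "drop": lambda s: s.pop(),                       # discard the top
    }
    for token in program.split():
        if token in words:
            words[token](stack)
        else:
            stack.append(int(token))
    return stack


result = run("3 dup * 4 +")  # square 3, then add 4
```

Because every word has the same shape (stack in, stack out), programs compose by simple concatenation: running `"3 dup *"` and then `"4 +"` on the resulting stack gives the same result as running the combined program.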


A library of math functions targeted at 32-bit and 64-bit x86 Linux systems. The purpose of this library is to provide faster drop in replacements for selected functions of the standard math library libm. These functions are written so they can be more optimized by compilers and all special case tests for increased consistency and accuracy have been removed. They are based on the corresponding implementations from the Cephes math library by Stephen L. Moshier. The code has been simplified perusing internal compiler facilities wherever possible and assuming little endian IEEE-754 single and double precision math.


A C++ parallel programming framework advocating high-level, pattern-based parallel programming. It chiefly supports streaming and data parallelism, targeting heterogeneous platforms composed of clusters of shared-memory platforms, possibly equipped with computing accelerators such as NVidia GPGPUs, Xeon Phi, Tilera TILE64.

FastFlow comes as a C++ template library designed as a stack of layers that progressively abstracts out the programming of parallel applications. The goal of the stack is threefold: portability, extensibility, and performance. For this, all the three layers are realised as thin strata of C++ templates that are 1) seamlessly portable; 2) easily extended via subclassing; and 3) statically compiled and cross-optimised with the application. The terse design ensures easy portability on almost all OSes and CPUs with a C++ compiler.


The FASTMath SciDAC Institute develops and deploys scalable mathematical algorithms and software tools for reliable simulation of complex physical phenomena and collaborates with application scientists to ensure the usefulness and applicability of FASTMath technologies.


FAUST (Functional Audio Stream) is a functional programming language specifically designed for real-time signal processing and synthesis. FAUST targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards.


This project provides an LV2 plugin architecture for the Faust programming language. The package contains the Faust architecture and templates for the needed LV2 manifest (ttl) files, a collection of sample plugins written in Faust, and a generic GNU Makefile for compiling the plugins.


A virtual guitar amplifier for Linux running with JACK (Jack Audio Connection Kit). It takes the signal from your guitar as any real amp would do: as a mono signal from your sound card. Your tone is processed by a main amp and a rack section. Both can be routed separately and deliver a processed stereo signal via JACK. You may fill the rack with effects from more than 25 built-in modules spanning from a simple noise gate to brain-slashing modulation effects like flanger, phaser or auto-wah. Your signal is processed with minimum latency. On any properly set-up Linux system you do not need to wait more than 10 milliseconds for your playing to be delivered, processed by guitarix. It offers the range of sounds you would expect from a full-featured universal guitar amp. A large part of the guitarix effects is written in Faust.


FBReader is a free (and ad-free) multi-platform ebook reader. It provides access to popular network libraries that contain a large set of ebooks. Download books for free or for a fee. Add your own catalog. It supports popular ebook formats: ePub, fb2, mobi, rtf, html, plain text, and a lot of other formats. It is highly customizable. Choose colors, fonts, page turning animations, dictionaries, bookmarks, etc. to make reading as convenient as you want.


The FEAST solver package is a free high-performance numerical library for solving the standard or generalized eigenvalue problem, and obtaining all the eigenvalues and eigenvectors within a given search interval. It is based on an innovative fast and stable numerical algorithm — named the FEAST algorithm — which deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms) or other Davidson-Jacobi techniques. The FEAST algorithm takes its inspiration from the density-matrix representation and contour integration technique in quantum mechanics. It is free from explicit orthogonalization procedures, and its main computational tasks consist of solving very few inner independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of magnitude smaller than the original one. The FEAST algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel architectures.

This general purpose FEAST solver package includes both reverse communication interfaces and ready to use predefined interfaces for dense, banded and sparse systems. It includes double and single precision arithmetic, and all the interfaces are compatible with Fortran (77,90) and C. FEAST is both a comprehensive library package, and an easy to use software. This solver is expected to significantly augment numerical performances and capabilities in large-scale modern applications.

Fedora Playground Repository

The Playground repository gives contributors a place to host packages that are not up to the standards of the main Fedora repository but may still be useful to other users. For now the Playground repository contains both packages that are destined for eventual inclusion into the main Fedora repository and packages that are never going to make it there. Users of the repository should be willing to endure a certain amount of instability when using packages from there.



Library fourpack provides a convenient and uniform interface to the Fast Fourier Transform implementations in the Intel Math Kernel Library and FFTW. This contains routines that:

  • implement an interface to really fast FFT libraries, FFTW and Intel Math Kernel Library;

  • perform linear filtration with the use of FFT;

  • compute direct and inverse spherical function transform; and

  • compute tuning parameters for FFTW.

This requires petools.


The NFFT (nonequispaced fast Fourier transform or nonuniform fast Fourier transform) is a C subroutine library for computing the nonequispaced discrete Fourier transform (NDFT) and its generalisations in one or more dimensions, of arbitrary input size, and of complex data.


A Python interface for NFFT.


When the data is irregular in either the "physical" or "frequency" domain, unfortunately, the FFT does not apply. Over the last twenty years, a number of algorithms have been developed to overcome this limitation - generally referred to as non-uniform FFTs (NUFFT), non-equispaced FFTs (NFFT) or unequally-spaced FFTs (USFFT). They achieve the same O(N log N) computational complexity, but with a larger, precision-dependent, and dimension-dependent constant.

We have developed some NUFFT libraries in Fortran 77 and Fortran 90 that are freely available under the GPL license.
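For orientation, the quantity that these fast algorithms approximate can be computed directly. The sketch below evaluates a trigonometric sum at arbitrary (non-equispaced) nodes in O(N·M) time using only the standard library. It is a slow reference computation, not a NUFFT; the function name `ndft`, the centred frequency range, and the sign convention in the exponent are assumptions, since libraries differ on these details.

```python
import cmath


def ndft(nodes, coeffs):
    """Directly evaluate f(x_j) = sum_k c_k * exp(-2*pi*i*k*x_j).

    `nodes` are arbitrary real sample positions x_j.
    `coeffs[m]` is the coefficient for frequency k = m - N//2,
    where N = len(coeffs), so the frequencies are centred on zero.
    """
    n = len(coeffs)
    out = []
    for x in nodes:
        total = 0j
        for m, c in enumerate(coeffs):
            k = m - n // 2
            total += c * cmath.exp(-2j * cmath.pi * k * x)
        out.append(total)
    return out


# Irregularly spaced nodes in [0, 1) and four coefficients (k = -2..1).
values = ndft([0.0, 0.1, 0.37], [1.0, 1.0, 1.0, 1.0])
```

With M nodes and N coefficients this costs M·N complex exponentials, which is the baseline the O(N log N) methods improve upon.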


OpenFFT is an open source parallel package for computing three-dimensional Fast Fourier Transforms (3-D FFTs) of both real and complex numbers of arbitrary input size. It originates from OpenMX (Open source package for Material eXplorer). OpenFFT adopts a communication-optimal domain decomposition method that is adaptive and capable of localizing data when transposing from one dimension to another for reducing the total volume of communication. It is written in C and MPI, with support for Fortran through the Fortran interface, and employs FFTW3 for computing 1-D FFTs.


Parallel Three-Dimensional Fast Fourier Transforms is a library for large-scale computer simulations on parallel platforms. 3D FFT is an important algorithm for simulations in a wide range of fields, including studies of turbulence, climatology, astrophysics and material science.


A parallel FFT software library based on MPI.


A parallel software library for the calculation of three-dimensional nonequispaced FFTs. It is available under the GPL licence. The parallelization is based on MPI. PNFFT depends on the PFFT and FFTW software libraries.


The Sparse Fast Fourier Transform is a recent algorithm developed by Hassanieh et al. [2, 3] for computing the discrete Fourier Transforms on signals with a sparse (exactly or approximately) frequency domain. The algorithm improves the asymptotic runtime compared to the prior methods based on pruning.


Feelpp is a C++ library for solving partial differential equations using generalized Galerkin methods such as the finite element method, the h/p finite element method, the spectral element method or the reduced basis method.


FiberViewerLight is an open-source C++ application to analyze fiber bundles. FiberViewerLight is now available as a 3D Slicer extension.


A library for fast computation of Gauss transforms in multiple dimensions, using the Improved Fast Gauss Transform and Approximate Nearest Neighbor searching. This software allows for efficient computation of probabilities by Kernel Density Estimation (KDE), and can reduce complexity of algorithms commonly used in Computer Vision, Machine Learning, etc, that must evaluate the Gauss transform.
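For reference, the discrete Gauss transform that such libraries accelerate can be evaluated directly. The sketch below is the plain O(N·M) double loop, not the Improved Fast Gauss Transform; the function name `gauss_transform` and the parameter names are assumptions, with `h` playing the role of the bandwidth.

```python
import math


def gauss_transform(sources, weights, targets, h):
    """Directly evaluate G(y_j) = sum_i q_i * exp(-||y_j - x_i||^2 / h^2).

    sources: list of points x_i (tuples of equal dimension)
    weights: list of weights q_i, one per source
    targets: list of points y_j at which the sum is evaluated
    h:       bandwidth of the Gaussian kernel
    """
    result = []
    for y in targets:
        total = 0.0
        for x, q in zip(sources, weights):
            dist2 = sum((a - b) ** 2 for a, b in zip(y, x))
            total += q * math.exp(-dist2 / (h * h))
        result.append(total)
    return result


# One weighted source at the origin, evaluated at two target points.
values = gauss_transform([(0.0, 0.0)], [2.0], [(0.0, 0.0), (1.0, 0.0)], h=1.0)
```

Up to a normalising constant, a kernel density estimate with a Gaussian kernel is exactly this sum with equal weights, which is why fast evaluation matters for large data sets.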

file systems


The advanced multi layered unification filesystem implements a union mount for Linux file systems.


A network file system based on HTTP and optimized to deliver experiment software in a fast, scalable, and reliable way. Files and file metadata are aggressively cached and downloaded on demand. Thereby the CernVM-FS decouples the life cycle management of the application software releases from the operating system.


Chirp is a user-level file system for collaboration across distributed systems such as clusters, clouds, and grids. Chirp allows ordinary users to discover, share, and access storage, whether within a single machine room or over a wide area network.

Chirp requires no special privileges. Unlike most standard filesystems or storage services, Chirp does not require root access, kernel changes, special modules, or anything like that. It can be run by ordinary users to export ordinary filesystems on any machine or port that you like.

Chirp is transparent. When used with Parrot or FUSE, Chirp servers can be transparently attached to existing ordinary applications — like tcsh, vi, and perl — without any sort of kernel changes or special privileges. Chirp is designed to give maximum compatibility with standard Unix semantics.

Chirp is easy to deploy. Chirp is designed to be deployed with a minimum of fuss. One simple command starts a Chirp server or a Chirp client. There is no complex configuration, installation, or setup to mess up. It just works. This makes Chirp ideal for on-the-fly storage management in batch computing and grid computing environments.


Filesystem in Userspace (FUSE) is an operating system mechanism for Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a "bridge" to the actual kernel interfaces.

FUSE is particularly useful for writing virtual file systems. Unlike traditional file systems that essentially save data to and retrieve data from disk, virtual filesystems do not actually store data themselves. They act as a view or translation of an existing file system or storage device.


CloudFusion lets you access a multitude of cloud storages from Linux like any file on your desktop. Work with files from Dropbox, Sugarsync, Amazon S3, Google Storage, Google Drive, and WebDAV storages like any other file on your desktop.


GlusterFS is an open source, distributed file system capable of scaling to several petabytes (actually, 72 brontobytes!) and handling thousands of clients. GlusterFS clusters together storage building blocks over Infiniband RDMA or TCP/IP interconnect, aggregating disk and memory resources and managing data in a single global namespace. GlusterFS is based on a stackable user space design and can deliver exceptional performance for diverse workloads.


The goal of the present HTTPFS project is to enable access to remote files, directories, and other containers (e.g., structured text documents, OS tables) through an HTTP pipe. HTTPFS system permits retrieval, creation and modification of these resources as if they were regular files and directories on a local filesystem. The remote host can be any UNIX or Win9x/WinNT box that is capable of running a Perl CGI script, and accessible either directly or via a web proxy or a gateway. HTTPFS runs entirely in user space. The current implementation fully supports reading as well as creating, writing, appending, and truncating of files on a remote HTTP host. HTTPFS provides an isolation level for concurrent file access stronger than the one mandated by POSIX file system semantics, closer to that of AFS. Both a programmatic interface with familiar open(), read(), write(), close(), etc. calls, and an interactive interface, via the popular Midnight Commander file browser, are provided.


An open-source, parallel file system that provides a POSIX compliant file system interface and can scale to thousands of clients, petabytes of storage and hundreds of gigabytes per second of I/O bandwidth. The key components of the Lustre file system are the Metadata Servers (MDS), the Metadata Targets (MDT), Object Storage Servers (OSS), Object Storage Targets (OST) and the Lustre clients.

The ability of a Lustre file system to scale capacity and performance for any need reduces the need to deploy many separate file systems, such as one for each compute cluster. Storage management is simplified by avoiding the need to copy data between compute clusters. In addition to aggregating storage capacity of many servers, the I/O throughput is also aggregated and scales with additional servers. Moreover, throughput and/or capacity can be easily increased by adding servers dynamically.


Parrot is a tool for attaching old programs to new storage systems. Parrot makes a remote storage system appear as a file system to a legacy application. Parrot does not require any special privileges, any recompiling, or any change whatsoever to existing programs. It can be used by normal users doing normal tasks.

Parrot is useful to users of distributed systems, because it frees them from rewriting code to work with new systems and relying on remote administrators to trust and install new software. Parrot is also useful to developers of distributed systems, because it allows rapid deployment of new code to real applications and real users that do not have the time, inclination, or permissions to build a kernel-level filesystem.

Parrot "speaks" a variety of remote I/O services, including HTTP, FTP, GridFTP, iRODS, HDFS, XRootD, GROW, and Chirp, on behalf of ordinary programs. It works by trapping a program’s system calls through the ptrace debugging interface, and replacing them with remote I/O operations as desired.


Fiona is designed to be simple and dependable. It focuses on reading and writing data in standard Python IO style, and relies upon familiar Python types and protocols such as files, dictionaries, mappings, and iterators instead of classes specific to OGR. Fiona can read and write real-world data using multi-layered GIS formats and zipped virtual file systems and integrates readily with other Python GIS packages such as pyproj, Rtree, and Shapely.


Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM). Firedrake enables users to employ a wide range of discretisations to an infinite variety of PDEs and employ either conventional CPUs or GPUs to obtain the solution.

Firedrake employs the Unified Form Language (UFL) and FEniCS Form Compiler (FFC) from the FEniCS Project while the parallel execution of FEM assembly is accomplished by the PyOP2 system. The global mesh data structures, as well as


ForestGOMP is an OpenMP runtime compatible with GCC 4.2, offering a structured way to efficiently execute OpenMP applications on hierarchical (NUMA) architectures.


FortranCL is an OpenCL interface for Fortran 90. It allows programmers to call the OpenCL parallel programming framework directly from Fortran, so developers can accelerate their Fortran code using graphical processing units (GPU) and other accelerators.

The interface is designed to be as close to the C OpenCL interface as possible, while being written in native Fortran 90 with type checking. It was originally designed as an OpenCL interface to be used by the Octopus code.

The interface is not complete but provides all the basic calls required to write a full Fortran 90 OpenCL program.


Freenet is free software which lets you anonymously share files, browse and publish "freesites" (web sites accessible only through Freenet) and chat on forums, without fear of censorship. Freenet is decentralised to make it less vulnerable to attack, and if used in "darknet" mode, where users only connect to their friends, is very difficult to detect.

Communications by Freenet nodes are encrypted and are routed through other nodes to make it extremely difficult to determine who is requesting the information and what its content is.

Users contribute to the network by giving bandwidth and a portion of their hard drive (called the "data store") for storing files. Files are automatically kept or deleted depending on how popular they are, with the least popular being discarded to make way for newer or more popular content. Files are encrypted, so generally the user cannot easily discover what is in his datastore, and hopefully can’t be held accountable for it. Chat forums, websites, and search functionality, are all built on top of this distributed data store.
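The keep-or-discard behaviour of the data store can be pictured with a small fixed-capacity cache. This is only a sketch, not Freenet's code: the class name DataStore is hypothetical, and "how recently an item was requested" is used as a stand-in for popularity, which is an assumption.

```python
from collections import OrderedDict


class DataStore:
    """Fixed-capacity store that discards the least recently requested item first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # key -> opaque blob, least recently used first

    def put(self, key, blob):
        self._items[key] = blob
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        blob = self._items.get(key)
        if blob is not None:
            self._items.move_to_end(key)  # a request keeps the item alive longer
        return blob


store = DataStore(capacity=2)
store.put("a", b"blob-a")
store.put("b", b"blob-b")
store.get("a")             # "a" was requested, so "b" is now the oldest entry
store.put("c", b"blob-c")  # the store is full, so "b" is discarded
print(sorted(k for k in ("a", "b", "c") if store.get(k) is not None))
```

The stored values are treated as opaque bytes, mirroring the point that a node operator cannot easily tell what the store contains.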


Furious.js is a scientific computing package for JavaScript that was inspired by Numpy.


Gaigen is a program which can generate implementations of geometric algebras. It generates C++ and C source code which implements a geometric algebra requested by the user. The choice to create a program which generates implementations of these algebras was made because we wanted performance similar to optimized hand-written code, while maintaining full generality; for (scientific) research and experimentation, many geometric algebras with different dimensionality, signatures and other properties may be required. Instead of coding each algebra by hand, Gaigen provides the possibility to generate the code for exactly the geometric algebra the user requires. This code may be less efficient than fully optimized hand-written code, but is likely to be much more efficient than one library which tries to support all possible algebras at once.

Gaigen supports algebras with a dimension from 0 to 8. The implementation of products used in Gaigen becomes infeasible for dimensions higher than about 7 or 8. For basis vectors, all 3 signatures are supported (-1, 0, +1). It is also possible to create reciprocal pairs of null vectors, which square to 0 with themselves, but to +1 or -1 with the other. Seven basic products are implemented (geometric product, outer product, left and right contraction, scalar product, (modified) Hestenes inner product) plus the outer morphism operator and the delta product. Several useful functions (such as factorization, meet and join) have been implemented.

Everything has been designed with memory and time efficiency in mind. It is possible to optimize Gaigen for your platform, application or processor by replacing the lowest computation layer. Gaigen can suggest optimizations for the algebras you generate with it by using the provided profiler function. Benchmarks in a ray tracing application show that Gaigen is 30 to 60 times faster than CLU (C++). In another application, Gaigen was 6000 times faster than Gable (Matlab).


GALAHAD is a thread-safe library of Fortran 2003 packages for solving nonlinear optimization problems. At present, the areas covered by the library are unconstrained and bound-constrained optimization, quadratic programming, nonlinear programming, systems of nonlinear equations and inequalities, and nonlinear least squares problems.


Galois is a system that automatically executes "Galoized" serial C++ or Java code in parallel on shared-memory machines. It works by exploiting amorphous data-parallelism, which is present even in irregular codes that are organized around pointer-based data structures such as graphs and trees. The Galois system includes the Lonestar benchmark suite and the ParaMeter profiler.

Multicore processors are increasingly becoming the norm. As a result, we need to find ways to make it easier to write parallel programs. Galois allows the programmer to write serial C++ or Java code while still getting the performance of parallel execution. All the programmer has to do is use Galois-provided data structures, which are necessary for correct concurrent execution, and annotate which loops should be run in parallel. The Galois system then speculatively extracts as much parallelism as it can. The current release includes a dozen sample benchmark applications from a broad range of domains that are written using the Galois extensions and classes.

Lonestar and LonestarGPU benchmark collections are collections of widely-used real-world applications that exhibit irregular behavior.


Galry is a high-performance interactive visualization package in Python based on OpenGL. It allows interactive visualization of very large plots (tens of millions of points) in real time, by using the graphics card as much as possible.

Galry’s high-level interface is directly inspired by Matplotlib and Matlab. The low-level interface can be used to write complex interactive visualization GUIs with Qt that deal with large 2D/3D datasets.

Visualization capabilities of Galry are not restricted to plotting, and include textures, 3D meshes, graphs, shapes, etc. Custom shaders can also be written for advanced uses.


Robot simulation is an essential tool in every roboticist’s toolbox. A well-designed simulator makes it possible to rapidly test algorithms, design robots, and perform regression testing using realistic scenarios. Gazebo offers the ability to accurately and efficiently simulate populations of robots in complex indoor and outdoor environments. At your fingertips is a robust physics engine, high-quality graphics, and convenient programmatic and graphical interfaces.


G-code (also RS-274), which has many variants, is the common name for the most widely used numerical control (NC) programming language. It is used mainly in computer-aided manufacturing for controlling automated machine tools. G-code is sometimes called G programming language.

In fundamental terms, G-code is a language in which people tell computerized machine tools how to make something. The how is defined by instructions on where to move, how fast to move, and through what path to move. The most common situation is that, within a machine tool, a cutting tool is moved according to these instructions through a toolpath, cutting away excess material to leave only the finished workpiece. The same concept also extends to noncutting tools such as forming or burnishing tools, photoplotting, additive methods such as 3D printing, and measuring instruments.
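As a rough illustration of "where to move, how fast to move, and through what path", the sketch below emits a tiny program that traces a rectangle. The function name and dimensions are made up for the example; the codes used (G21 for millimetres, G90 for absolute positioning, G0 for rapid moves, G1 for feed-rate moves) are common, but dialects vary between machines, so treat the output as illustrative rather than ready to run on a real tool.

```python
def rectangle_toolpath(width, height, depth, feed_rate):
    """Return G-code lines that trace a rectangle at a fixed cutting depth."""
    corners = [(width, 0.0), (width, height), (0.0, height), (0.0, 0.0)]
    lines = [
        "G21",               # units: millimetres
        "G90",               # absolute positioning
        "G0 Z5.000",         # rapid move to a safe height above the work
        "G0 X0.000 Y0.000",  # rapid move to the start corner
        f"G1 Z{-depth:.3f} F{feed_rate:.0f}",  # plunge to cutting depth at the feed rate
    ]
    for x, y in corners:
        # Each G1 line says where to go (X, Y) and how fast (F).
        lines.append(f"G1 X{x:.3f} Y{y:.3f} F{feed_rate:.0f}")
    lines.append("G0 Z5.000")  # retract when the path is finished
    return lines


program = rectangle_toolpath(width=40.0, height=20.0, depth=1.5, feed_rate=300)
print("\n".join(program))
```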


Skeinforge is a tool chain composed of Python scripts that converts your 3D model into G-Code instructions for RepRap.


A translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and vector abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.


Reads and writes shapefiles using GDAL in the background and is therefore quite fast. Intended as an easier-to-use alternative to the original Python GDAL bindings.

Fiona is designed to be simple and dependable. It focuses on reading and writing data in standard Python IO style and relies upon familiar Python types and protocols such as files, dictionaries, mappings, and iterators instead of classes specific to OGR. Fiona can read and write real-world data using multi-layered GIS formats and zipped virtual file systems and integrates readily with other Python GIS packages such as pyproj, Rtree, and Shapely.


GDAL Python wrapper for reading and writing geospatial data to a variety of vector formats.


Rasterio reads and writes geospatial raster datasets. It employs GDAL under the hood for file I/O and raster formatting. Its functions typically accept and return Numpy ndarrays. Rasterio is designed to make working with geospatial raster data more productive.


The GeM software is designed to automate the generation of determining equations and related operations, in order to compute symmetries and conservation laws for any ODE/PDE system, generally without limitations in DE order and number of variables.

ODE/PDE systems containing arbitrary functions and/or constants can be analyzed, and classes of functions for which additional symmetries / conservation laws occur can be isolated.

GeM output (determining equations) is usually fed into Maple "rifsimp" (a stable routine for differential reduction), which simplifies determining equations, and performs case splits when the given system contains arbitrary functions and/or constants.

GeM also contains special routines to output computed symmetries as well as fluxes/densities of computed conservation laws.


This project provides tools to play with geographical data. It also works with non-geographical data, except for map visualizations. There are embedded data sources in the project, but you can easily play with your own data in addition to the available ones. CSV files containing data about airports, train stations, countries, and so on are loaded; then you can:

  • perform various types of queries (find this key, or find keys with this property)

  • run fuzzy searches based on string distance (find things roughly named like this)

  • run geographical searches (find things next to this place)

  • get results on a map, or export them as CSV data or as a Python object
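The kinds of query listed above can be sketched with the Python standard library alone. This is not the project's API: the function names, the sample rows and their approximate coordinates are all invented for illustration, and the fuzzy match uses difflib's similarity ratio rather than any particular string distance.

```python
import csv
import io
import math
from difflib import get_close_matches

# Made-up sample rows standing in for a CSV data source (approximate coordinates).
SAMPLE_CSV = """code,name,lat,lon
CDG,Paris Charles de Gaulle,49.01,2.55
ORY,Paris Orly,48.73,2.36
LHR,London Heathrow,51.47,-0.45
"""


def load(text):
    """Index the CSV rows by their 'code' column."""
    return {row["code"]: row for row in csv.DictReader(io.StringIO(text))}


def fuzzy_find(places, name, cutoff=0.6):
    """Return codes whose names are roughly similar to `name`."""
    by_name = {row["name"]: code for code, row in places.items()}
    matches = get_close_matches(name, list(by_name), n=3, cutoff=cutoff)
    return [by_name[m] for m in matches]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres, assuming a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def find_near(places, lat, lon, radius_km):
    """Return codes of places within `radius_km` of a point, nearest first."""
    hits = []
    for code, row in places.items():
        distance = haversine_km(lat, lon, float(row["lat"]), float(row["lon"]))
        if distance <= radius_km:
            hits.append((distance, code))
    return [code for _, code in sorted(hits)]


places = load(SAMPLE_CSV)
print(places["ORY"]["name"])               # exact key lookup
print(fuzzy_find(places, "Paris Orli"))    # fuzzy search on a misspelled name
print(find_near(places, 48.85, 2.35, 50))  # places within 50 km of central Paris
```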


A JavaScript framework that combines the GIS functionality of OpenLayers with the user interface of the ExtJS library. It enables the construction of desktop-like GIS applications on the web.


High-level components for GeoExt-based applications.


Geogram is a programming library of geometric algorithms. It includes a simple yet efficient Mesh data structure (for surface and volumetric meshes), exact computer arithmetic (à la Shewchuk, implemented in GEO::expansion), a predicate code generator (PCK: Predicate Construction Kit), standard geometric predicates (orient/insphere), Delaunay triangulation, Voronoi diagram, spatial search data structures, spatial sorting, and less standard ones (more general geometric predicates, intersection between a Voronoi diagram and a triangular or tetrahedral mesh embedded in n dimensions). The latter is used by FWD/WarpDrive, the first algorithm that computes semi-discrete Optimal Transport in 3D that scales up to 1 million Dirac masses.


JavaScript Geo visualization and Analysis Library.


GeoJSON[1] is an open standard format for encoding collections of simple geographical features along with their non-spatial attributes using JavaScript Object Notation. The features include points (therefore addresses and locations), line strings (therefore streets, highways and boundaries), polygons (countries, provinces, tracts of land), and multi-part collections of these types. GeoJSON features need not represent entities of the physical world only; mobile routing and navigation apps, for example, might describe their service coverage using GeoJSON.[2]
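A minimal sketch of what such a document looks like, built with Python's json module. The helper names, place names and coordinates are invented for the example; note that GeoJSON positions are written as [longitude, latitude].

```python
import json


def point_feature(lon, lat, **properties):
    """Build a GeoJSON Feature holding a Point; positions are [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,  # the non-spatial attributes
    }


def line_feature(positions, **properties):
    """Build a GeoJSON Feature holding a LineString (e.g. a street)."""
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": [list(p) for p in positions]},
        "properties": properties,
    }


collection = {
    "type": "FeatureCollection",
    "features": [
        point_feature(2.35, 48.85, name="Example address"),
        line_feature([(2.35, 48.85), (2.36, 48.86)], name="Example street"),
    ],
}

text = json.dumps(collection, indent=2)
print(text)
```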

The GeoJSON format differs from other GIS standards in that it was written and is maintained not by a formal standards organization, but by an Internet working group of developers.


A simple Python GeoJSON file reader and writer. PyGeoj treats GeoJSON as an actual file format instead of a set of formatting rules.


Python bindings and utilities for GeoJSON.


An extension of GeoJSON that encodes topology. Rather than representing geometries discretely, geometries in TopoJSON files are stitched together from shared line segments called arcs.[18] Arcs are sequences of points, while line strings and polygons are defined as sequences of arcs. Each arc is defined only once, but can be referenced several times by different shapes, thus reducing redundancy and decreasing the file size.[19] In addition, TopoJSON facilitates applications that use topology, such as topology-preserving shape simplification, automatic map coloring, and cartograms.
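The arc-sharing idea can be sketched as follows. The example assumes an unquantized topology (no "transform" member), so arc coordinates are plain positions; the two adjacent unit squares and the function names are invented for illustration. A negative arc index is read as the one's complement of the arc number and means the arc is traversed in reverse.

```python
def stitch(arcs, arc_indexes):
    """Rebuild one line or ring from a sequence of arc references."""
    points = []
    for index in arc_indexes:
        # ~i (a negative number) refers to arc i traversed backwards.
        arc = arcs[index] if index >= 0 else arcs[~index][::-1]
        # Consecutive arcs share an endpoint, so skip the duplicate point.
        points.extend(arc if not points else arc[1:])
    return points


def polygon_rings(topology, geometry):
    """Expand a TopoJSON Polygon into explicit coordinate rings."""
    return [stitch(topology["arcs"], ring) for ring in geometry["arcs"]]


topology = {
    "type": "Topology",
    "arcs": [
        [[1, 0], [1, 1]],                  # arc 0: the edge shared by both squares
        [[1, 1], [0, 1], [0, 0], [1, 0]],  # arc 1: the rest of the left square
        [[1, 0], [2, 0], [2, 1], [1, 1]],  # arc 2: the rest of the right square
    ],
    "objects": {
        "left": {"type": "Polygon", "arcs": [[0, 1]]},
        "right": {"type": "Polygon", "arcs": [[2, -1]]},  # -1 is ~0: arc 0 reversed
    },
}

for name, geometry in topology["objects"].items():
    print(name, polygon_rings(topology, geometry))
```

The shared edge is stored once (arc 0) and referenced by both polygons, which is where the reduction in redundancy comes from.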


A powerful, metadata-driven Spatial ETL tool dedicated to the integration of different spatial data sources for building and updating geospatial data warehouses. GeoKettle enables the Extraction of data from data sources, the Transformation of data in order to correct errors, cleanse data, change the data structure, and make the data compliant with defined standards, and the Loading of transformed data into a target DataBase Management System (DBMS) in OLTP or OLAP/SOLAP mode, GIS file or Geospatial Web Service.


GeoMapApp is an earth science exploration and visualization application that is continually being expanded as part of the Marine Geoscience Data System (MGDS) at the Lamont-Doherty Earth Observatory of Columbia University. The application provides direct access to the Global Multi-Resolution Topography (GMRT) compilation that hosts high resolution (~100 m node spacing) bathymetry from multibeam data for ocean areas and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) and NED (National Elevation Dataset) topography datasets for the global land masses.


The GEOS–Chem model is a global three-dimensional model of tropospheric chemistry driven by assimilated meteorological observations from the Goddard Earth Observing System (GEOS) of the NASA Global Modeling Assimilation Office.

GEOS–Chem began as a merging of Mian Chin’s GEOS–CTM code with the emissions, dry deposition, and chemistry routines from the old Harvard–GISS 9-layer model. Since then, we have added many updates and improvements to GEOS–Chem. The model now uses detailed inventories for fossil fuel, biomass burning, biofuel burning, biogenic, and aerosol emissions. GEOS–Chem includes state-of-the-art transport (TPCORE) and photolysis (FAST–J) routines, as well as the SMVGEAR II chemistry solver package. Detailed aerosol microphysical simulations using GEOS–Chem may be performed with the TOMAS aerosol microphysics code or the APM aerosol microphysics code.

GEOS–Chem has been parallelized using OpenMP compiler directives, and it scales well when running across multiple CPUs on shared-memory machines. We are currently building a Grid-Independent version of GEOS-Chem in order to take advantage of distributed memory architectures and MPI parallelization.

Several software tools facilitate the visualization of GEOS-Chem model outputs. The IDL-based GAMAP package—which is developed and maintained by the GEOS–Chem Support Team—allows for easy generation of a wide variety of plots and animations. Furthermore, several members of the GEOS-Chem user community are now developing open-source software visualization tools for other computer languages, including Matlab, NCL, R, and Python.

geoscience data servers


The Dapper Data Viewer (aka DChart) allows you to visualize and download in-situ oceanographic or atmospheric data from file or OpenDap server. Features include an interactive map that is draggable, an in-situ station layer that allows you to select data stations, and a plot window that allows you to plot data from one or more stations. Three plot types are supported (profile, property-property, and time series) and users can interact directly with the plot to pan or zoom in and out.

Dapper is an OPeNDAP/Java-based web server developed by the EPIC group at PMEL that provides networked access to in-situ and gridded data. The Dapper servlet contains a set of configurable services that convert in-situ or gridded data to the OPeNDAP protocol.


ERDDAP is a data server that gives you a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. This particular ERDDAP installation has oceanographic data.


An implementation of a broker catalog service that allows clients to discover and evaluate geoinformation resources over a federation of data sources, and publishes different catalog interfaces, allowing different clients to use the service. A data provider can deploy his/her own GI-cat instance, grouping together disparate data sources, to accommodate his/her users' needs.

GI-cat features caching and mediation capabilities and can act as a broker towards disparate catalog and access services: by implementing metadata harmonization and protocol adaptation, it is able to transform query results to a uniform and consistent interface. GI-cat is based on a service-oriented framework of modular components and can be customized and tailored to support different deployment scenarios.

GI-cat can access a multiplicity of catalog services, as well as inventory and access services to discover, and possibly access, heterogeneous ESS resources. Specific components implement mediation services for interfacing heterogeneous service providers which expose multiple standard specifications; they are called Accessors. These mediating components map the heterogeneous providers' metadata models into a uniform data model which implements ISO 19115, based on official ISO 19139 schemas and its extensions. Accessors also implement the query protocol mapping; they translate the query requests expressed according to the interface protocols exposed by GI-cat into the multiple query dialects spoken by the resource service providers. Currently, a number of well-accepted catalog and inventory services are supported, including several OGC Web Services (e.g. WCS, WMS), THREDDS Data Server, SeaDataNet Common Data Index, and GBIF.


Hyrax is a new data server which combines the efforts at UCAR/HAO to build a high performance DAP-compliant data server for the Earth System Grid II project with existing software developed by OPeNDAP.

Hyrax uses the Java servlet mechanism to hand off requests from a general web daemon to DAP format-specific software. This results in higher performance for small requests. The servlet front end, which we call the OPeNDAP Lightweight Front end Server (OLFS), looks at each request and formulates a query to a second server (which may or may not be on the same machine as the OLFS) called the Back End Server (BES).

The BES is the high-performance server software from HAO. It handles reading data from the data stores and returning DAP-compliant responses to the OLFS. In turn, the OLFS may pass these responses back to the requestor with little or no modification or it may use them to build more complex responses. The nature of the Inter Process Communication (IPC) between the OLFS and BES is such that they should both be on the same machine or be able to communicate over a very high bandwidth channel.


The Live Access Server (LAS) is a highly configurable web server designed to provide flexible access to geo-referenced scientific data. It can present distributed data sets as a unified virtual data base through the use of DODS networking. Ferret is the default visualization application used by LAS.


The THREDDS Data Server (TDS) is a web server that provides metadata and data access for scientific datasets, using a variety of remote data access protocols.


Python Geographic Visualizer (GeoVis) is a standalone geographic visualization module for the Python programming language intended for easy everyday use by novices and power programmers alike. It has one-liners for quickly visualizing a shapefile, building and styling basic maps with multiple shapefile layers, and/or saving to image files. It uses the built-in Tkinter or other third-party rendering modules to do its main work.


A control system framework for personal fabrication.


Gestalt is a framework for building controllers for automated tools. It enables you to import your machines as Python modules, and makes it easy to connect machines to browser-based user interfaces.


Da bomb.


A git repository browser that can generate static HTML instead of having to run dynamically.

It is smaller, with fewer features and a different set of tradeoffs than other similar software, so if you’re looking for a robust and featureful git browser, please look at gitweb or cgit instead.

However, if you want to generate static HTML at the expense of features, then it can be useful.


A sleek and powerful git GUI that is written in Python.


The hub subcommand for git allows you to perform many of the operations made available by GitHub’s v3 REST API from the git command line.

You can fork, create, delete and modify repositories. You can get information about users, repositories and issues. You can star, watch and follow things, and find out who else is doing the same. The API is quite extensive. With this command you can do many of your day to day GitHub actions without needing a web browser.

You can also chain commands together using the output of one as the input of another. For example you could use this technique to clone all the repos of a GitHub user or organization, with one command.


GitLab is an advanced Git-repository manager. It introduces a powerful code review and issue-tracking system, complete with GitLab CI: a powerful continuous integration tool.


Gitless is an experimental version control system built on top of Git. Many people complain that Git is hard to use. We think the problem lies deeper than the user interface, in the concepts underlying Git. Gitless is an experiment to see what happens if you put a simple veneer on an app that changes the underlying concepts. Because Gitless is implemented on top of Git (could be considered what Git pros call a porcelain of Git), you can always fall back on Git. And of course your coworkers you share a repo with need never know that you’re not a Git aficionado.


This git command "clones" an external git repo into a subdirectory of your repo. Later on, upstream changes can be pulled in, and local changes can be pushed back. Simple.


Givaro is a C++ library for arithmetic and algebraic computations. Its main features are implementations of the basic arithmetic of many mathematical entities: prime fields, extension fields, finite fields, finite rings, polynomials, algebraic numbers, and arbitrary-precision integers and rationals (C++ wrappers over GMP). It also provides data structures and templated classes for the manipulation of basic algebraic objects, such as vectors, matrices (dense, sparse, structured), and univariate polynomials (and therefore recursive multivariate). It contains different program modules and is fully compatible with the LinBox linear algebra library and the KAAPI kernel for Adaptative, Asynchronous Parallel and Interactive programming.


Gizeh is a Python library for vector graphics. Gizeh is written on top of the module cairocffi, which is a Python binding of the popular C library Cairo. Cairo is powerful, but difficult to learn and use. Gizeh implements a few classes on top of Cairo that make it more intuitive.


This site summarises the 1D lake water balance and vertical stratification model: “The General Lake Model” (GLM).


Lasso and elastic-net regularized generalized linear models. This is a Matlab port of the extremely efficient procedures for fitting the entire lasso or elastic-net path for linear regression, logistic and multinomial regression, Poisson regression and the Cox model.


This page contains the codes for learning the Granger causality in different settings. The codes are written in Matlab and depend on the GLMnet package for performing Lasso. Lasso-Granger is an efficient algorithm for learning the temporal dependency among multiple time series based on variable selection using Lasso. Copula-Granger extends the power of Lasso-Granger to non-linear datasets. It uses the copula technique to separate the marginal properties of the joint distribution from its dependency structure.


We describe glsim, a C++ library designed to provide routines to perform basic housekeeping tasks common to a very wide range of simulation programs, such as reading simulation parameters or reading and writing self-describing binary files with simulation data. The design also provides a framework to add features to the library while preserving its structure and interfaces.


Glumpy is a Python library for scientific visualization that is fast, scalable and beautiful. Glumpy offers an intuitive interface between NumPy and modern OpenGL.

GNU Radio

GNU Radio is a free software development toolkit that provides the signal processing runtime and processing blocks to implement software radios using readily-available, low-cost external RF hardware and commodity processors. It is widely used in hobbyist, academic and commercial environments to support wireless communications research as well as to implement real-world radio systems.

GNU Radio applications are primarily written using the Python programming language, while the supplied, performance-critical signal processing path is implemented in C++ using processor floating point extensions where available. Thus, the developer is able to implement real-time, high-throughput radio systems in a simple-to-use, rapid-application-development environment.


Gqrx is a software-defined radio receiver powered by the GNU Radio SDR framework and the Qt graphical toolkit. Gqrx supports much of the SDR hardware available, including Funcube Dongles, rtl-sdr, HackRF and USRP devices.


Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

Google Earth Engine

Google Earth Engine brings together the world’s satellite imagery — trillions of scientific measurements dating back almost 40 years — and makes it available online with tools for scientists, independent researchers, and nations to mine this massive warehouse of data to detect changes, map trends and quantify differences on the Earth’s surface. Applications include: detecting deforestation, classifying land cover, estimating forest biomass and carbon, and mapping the world’s roadless areas.


GPI-2 implements the GASPI specification, an API specification which originates from the ideas and concepts of GPI. GPI-2 is an API for asynchronous communication. It provides a flexible, scalable and fault-tolerant interface for parallel applications.


GPTIPS is a free symbolic data mining platform and interactive modelling environment for MATLAB.


Gradle is an open source build automation system. Gradle can automate the building, testing, publishing, deployment and more of software packages or other types of projects such as generated static websites, generated documentation or indeed anything else.

Gradle is a project automation tool that builds upon the concepts of Apache Ant and Apache Maven and introduces a Groovy-based domain-specific language (DSL) instead of the more traditional XML form of declaring the project configuration.

Gradle was designed for multi-project builds which can grow to be quite large, and supports incremental builds by intelligently determining which parts of the build tree are up-to-date, so that any task dependent upon those parts will not need to be re-executed.
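The up-to-date idea can be illustrated with a toy build graph. This is a sketch of the general technique only, not Gradle's implementation or API: the Build class, the task names and the use of a SHA-256 fingerprint over task inputs are all assumptions made for the example. A task runs when its inputs have changed since the last build or when one of its dependencies ran.

```python
import hashlib


def fingerprint(inputs):
    """Hash a mapping of input names to contents into one digest."""
    digest = hashlib.sha256()
    for key in sorted(inputs):
        digest.update(key.encode() + b"\0" + inputs[key].encode() + b"\0")
    return digest.hexdigest()


class Build:
    def __init__(self):
        self.tasks = {}      # name -> (dependency names, function returning inputs)
        self.last_seen = {}  # name -> fingerprint recorded when the task last ran

    def add_task(self, name, read_inputs, depends_on=()):
        self.tasks[name] = (tuple(depends_on), read_inputs)

    def run(self, target):
        """Bring `target` up to date and return the tasks that actually ran."""
        executed = []
        visited = set()

        def visit(name):
            if name in visited:
                return
            visited.add(name)
            depends_on, read_inputs = self.tasks[name]
            for dep in depends_on:
                visit(dep)
            current = fingerprint(read_inputs())
            stale = self.last_seen.get(name) != current
            if stale or any(dep in executed for dep in depends_on):
                executed.append(name)  # a real tool would run the task action here
                self.last_seen[name] = current

        visit(target)
        return executed


sources = {"Main.java": "class Main {}"}
resources = {"app.properties": "debug=false"}

build = Build()
build.add_task("compile", lambda: sources)
build.add_task("processResources", lambda: resources)
build.add_task("jar", lambda: {}, depends_on=("compile", "processResources"))

first = build.run("jar")    # nothing has been built yet, so everything runs
second = build.run("jar")   # nothing changed, so nothing runs
resources["app.properties"] = "debug=true"
third = build.run("jar")    # only the changed branch and its dependents run
print(first, second, third)
```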

The initial plugins are primarily focused around Java, Groovy and Scala development and deployment, but more languages and project workflows are on the roadmap.


Graphite is an open-source, distributed parallel simulator for multicore architectures. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development.

Graphite (3D)

Graphite is a research platform for computer graphics, 3D modeling and numerical geometry.


Grappa makes an entire cluster look like a single, powerful, shared-memory machine. By leveraging the massive amount of concurrency in large-scale data-intensive applications, Grappa can provide this useful abstraction with high performance. Unlike classic distributed shared memory (DSM) systems, Grappa does not require spatial locality or data reuse to perform well.

Data-intensive, or "Big Data", workloads are an important class of large-scale computations. However, the commodity clusters they are run on are not well suited to these problems, requiring careful partitioning of data and computation. A diverse ecosystem of frameworks has arisen to tackle these problems, such as MapReduce, Spark, Dryad, and GraphLab, which ease development of large-scale applications by specializing to particular algorithmic structure and behavior.

Grappa provides abstraction at a level high enough to subsume many performance optimizations common to these data-intensive platforms. However, its relatively low-level interface provides a convenient abstraction for building data-intensive frameworks on top of. Prototype implementations of (simplified) MapReduce, GraphLab, and a relational query engine have been built on Grappa that out-perform the original systems.


The Monash simple climate model is based on the Globally Resolved Energy Balance (GREB) model, which is a climate model published by Dommenget and Floeter [2011] in the international peer-reviewed science journal Climate Dynamics. The model simulates most of the main physical processes in the climate system in a very simplistic way and therefore allows very fast and simple climate model simulations. It can compute global climate simulations of one year in about 1 second on a normal PC. Despite its simplicity, the model simulates the climate response to external forcings, such as a doubling of the CO2 concentration, very realistically (similar to state-of-the-art climate models).


In gRPC a client application can directly call methods on a server application on a different machine as if it was a local object, making it easier for you to create distributed applications and services. As in many RPC systems, gRPC is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. On the server side, the server implements this interface and runs a gRPC server to handle client calls. On the client side, the client has a stub that provides exactly the same methods as the server.

gRPC clients and servers can run and talk to each other in a variety of environments - from servers inside Google to your own desktop - and can be written in any of gRPC’s supported languages. So, for example, you can easily create a gRPC server in Java with clients in Go, Python, or Ruby. In addition, the latest Google APIs will have gRPC versions of their interfaces, letting you easily build Google functionality into your applications.
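The service and stub split described above can be illustrated without gRPC itself. The following is a minimal in-process sketch in plain Python, not the gRPC API: the names `GreeterService`, `Stub`, and `dispatch` are hypothetical, and a JSON string stands in for the network transport and message encoding that a real RPC system would provide.

```python
import json


class GreeterService:
    """Server-side implementation of a hypothetical service interface."""

    def say_hello(self, name):
        return f"Hello, {name}!"


def dispatch(service, request_text):
    """Server side: decode a request, call the named method, encode the reply."""
    request = json.loads(request_text)
    method = getattr(service, request["method"])
    result = method(*request["args"])
    return json.dumps({"result": result})


class Stub:
    """Client side: exposes the same method names and forwards each call."""

    def __init__(self, transport):
        self._transport = transport  # callable: request string -> reply string

    def __getattr__(self, method_name):
        # Only reached for names not found normally, i.e. the remote methods.
        def remote_call(*args):
            request = json.dumps({"method": method_name, "args": list(args)})
            reply = self._transport(request)
            return json.loads(reply)["result"]
        return remote_call


service = GreeterService()
# The lambda stands in for a network round trip (an assumption of this sketch).
stub = Stub(lambda request: dispatch(service, request))
greeting = stub.say_hello("world")
```

The caller uses `stub.say_hello(...)` exactly as it would use the local object, which is the property the description attributes to gRPC stubs.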


The community GSI system is a variational data assimilation system, designed to be flexible, state-of-the-art, and run efficiently on various parallel computing platforms. The GSI system is in the public domain and is freely available for community use.

The Developmental Testbed Center (DTC) currently maintains and supports a community version of the GSI system (now at Version 3.3). The testing and support of this GSI system at the DTC currently focus on regional numerical weather prediction (NWP) applications coupled with the Weather Research and Forecasting (WRF) Model, but the GSI can be applied to the Global Forecast System (GFS) as well as other modelling systems.


The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.


CythonGSL provides a Cython interface for the GNU Scientific Library (GSL). Cython is the ideal tool to speed up numerical computations by converting typed Python code to C and generating Python wrappers so that these compiled functions can be called from Python. Scientific programming often requires use of various numerical routines (e.g. numerical integration, optimization). While SciPy provides many of those tools, there is an overhead associated with using these functions within your Cython code. CythonGSL allows you to shave off that last layer by providing Cython declarations for the GSL which allow you to use this high-quality library from within Cython without any Python overhead.


A portable, object-based Fortran interface to the GNU scientific library, a collection of numerical routines for scientific computing.


The GNU Scientific Library for Lisp (GSLL) allows you to use the GNU Scientific Library (GSL) from Common Lisp. This library provides a full range of common mathematical operations useful to scientific and engineering applications. The design of the GSLL interface is such that access to most of the GSL library is possible in a Lisp-natural way; the intent is that the user not be hampered by the restrictions of the C language in which GSL has been written. GSLL thus provides interactive use of GSL for getting quick answers, even for someone not intending to program in Lisp.


O2scl is a C++ library for object-oriented numerical programming. It includes interpolation, differentiation, integration, roots of polynomials, equation solving, minimization, constrained minimization, Monte Carlo integration, simulated annealing, least-squares fitting, solution of ordinary differential equations, two-dimensional interpolation, Chebyshev approximation, unit conversions, and file I/O with HDF5.


A Fortran90 input/output library, "gtool5", is developed for use with numerical simulation models in the fields of Earth and planetary sciences. The use of this library will simplify implementation of input/output operations into program code in a consolidated form independent of the size and complexity of the software and data. The library also enables simple specification of the metadata needed for post-processing and visualization of the data. These aspects improve the readability of simulation code, which facilitates the simultaneous performance of multiple numerical experiments with different software and efficiency in examining and comparing the numerical results.


GUESS is an exploratory data analysis and visualization tool for graphs and networks. The system contains a domain-specific embedded language called Gython (an extension of Python, or more specifically Jython) which supports the operators and syntactic sugar necessary for working on graph structures in an intuitive manner. An interactive interpreter binds the text that you type in the interpreter to the objects being visualized for more useful integration. GUESS also offers a visualization front end that supports the export of static images and dynamic movies.


Gun is a persisted distributed cache, part of a NoDB movement. It requires zero maintenance and runs on your own infrastructure. Think of it as "Dropbox for Databases" or a "Self-hosted Firebase". This is an early preview, so check out the GitHub repository and read on.

Everything gets cached, so your users experience lightning fast response times. Since gun can be embedded anywhere JavaScript can run, that cache can optionally be right inside your user’s browser using localStorage fallbacks. Updates are then pushed up to the servers when the network is available.


gvSIG Community Edition (CE) is a community driven GIS project fork of gvSIG that will be bundled with SEXTANTE and GRASS GIS. This project is not supported by the gvSIG Association. gvSIG CE is not an official project of gvSIG. gvSIG CE is a fully functional Open Source Desktop GIS that provides powerful visualization (including thematic maps, advanced symbology and labelling), cartography, raster, vector and geoprocessing in a single, integrated software suite.


An optimized HTTP server with support for HTTP/1.x and HTTP/2. H2O is a very fast HTTP server written in C. It can also be used as a library.


The H5FDdsm project provides a Virtual File Driver for HDF5, which can be used to link two applications via a virtual file system. One application (server/host) owns a memory buffer, which may be distributed over N processes (DSM buffer) - the second application (client) writes to HDF5 in parallel using M processes and the data is diverted to the DSM host, where it can be read in parallel as if from disk. The file system is bypassed completely and the data is transmitted using one of several network protocols (MPI or TCP over sockets currently supported). Note that the interface can also be used within the same application as a parallel data staging layer, in this case, no connection is required and information is exchanged between processes using MPI.


H5Part is a very simple data storage schema and provides an API that simplifies the reading/writing of the data to the HDF5 file format. An important foundation for a stable visualization and data analysis environment is a stable and portable file storage format and its associated APIs. The presence of a "common file storage format," including associated APIs, will help foster a fundamental level of interoperability across the project’s software infrastructure. It will also help ensure that key data analysis capabilities are present during the earliest phases of the software development effort.


ICARUS is a ParaView plug-in interfaced around the H5FDdsm driver for steering and visualizing in-situ HDF5 output of simulation codes.


The Habanero-C (HC) language under development in the Habanero project at Rice University builds on past work on Habanero-Java, which in turn was derived from X10 v1.5. HC serves as a research testbed for new compiler and runtime software technologies for extreme scale systems for homogeneous and heterogeneous processors.

Habanero-C is designed to be mapped onto hardware platforms with lightweight system software stacks, such as the Customizable Heterogeneous Platform (CHP) being developed in the NSF Expeditions Center for Domain-Specific Computing (CDSC) which includes CPUs, GPUs, and FPGAs. The C foundation also makes it easier to integrate HC with communication middleware for cluster systems, such as MPI and GASNet.

The Habanero-C compiler is written in C++ and is built on top of the ROSE compiler infrastructure, which was also used in the DARPA-funded PACE project at Rice University. The bulk of the Habanero-C runtime has been written from scratch in portable ANSI C. However, a few library routines for low-level synchronization and atomic operations are written in assembly language for the target platform. To date, the Habanero-C runtime has been ported and tested on Intel X86, Cyclops 64, Power7, Sun Niagara 2 and Intel SCC multicore platforms.


The Apache Hadoop project provides an open-source framework for reliable, scalable, distributed computing. As such, it can be deployed and used in the Grid 5000 platform. However, its configuration and management can sometimes be difficult, especially given the dynamic nature of clusters within Grid 5000 reservations. In turn, Execo offers a Python API to manage process execution. It is well suited for quick and easy creation of reproducible experiments on distributed hosts.

The project presented here is called hadoop_g5k and provides a layer built on top of Execo for managing Hadoop clusters and preparing reproducible experiments in Hadoop. It offers a set of scripts to be used in command-line interfaces and a Python interface.


A Python library that allows you to finely manage unix processes on thousands of remote hosts.


HClib is a library implementation of the Habanero-C language. The reference HClib implementation is built on top of the Open Community Runtime (OCR).


How can we find useful patterns and anomalies in large scale real-world data with multiple attributes? Tensors are suitable for modeling these multidimensional data, and widely used for the analysis of social networks, web data, network traffic, and in many other settings. HaTen2 is a scalable distributed algorithm of tensor decomposition for large scale tensors running on the MapReduce platform. HaTen2 decomposes 100X larger tensors compared to existing methods.


The Haxe programming language is a high-level, strictly typed programming language which is used by the Haxe compiler to produce cross-platform native code. It is easy to learn if you are already familiar with Java, C++, PHP, AS3 or similar object-oriented languages. It has been designed to adapt to the native behaviors of the various platforms and to allow efficient cross-platform development.

The Haxe Compiler is responsible for translating the Haxe programming language to the target platform native source code or binary. Each platform is natively supported, without any overhead coming from running inside a virtual machine. The Haxe Compiler is very efficient and can compile thousands of classes in seconds.

The Haxe standard library provides a common set of highly tested APIs that gives you complete cross-platform behavior. This includes data structures, maths and date, serialization, reflection, bytes, crypto, file system, database access, etc. The Haxe standard library also includes platform-specific API that gives you access to important parts of the platform capabilities, and can be easily extended.

The compiler targets include Flash, Neko, JavaScript, ActionScript 3, PHP, C++, Java, C# and Python.

Haxe is written in OCaml.

Haxe UI

Create rich, cross-platform user interfaces quickly with a single framework.


Massive provides a number of open source libraries and tools that are intended to increase the quality, efficiency and consistency of cross-platform development with Haxe.


A haxelib that downloads the node-webkit binary for your platform and makes it accessible via haxelib run node-webkit path/to/index.html. Node Webkit lets you run a Webkit shell on the desktop, meaning you can use Haxe and HTML5 / JS technologies to build your app. It provides full access to the NodeJS APIs so your app can integrate with the system.


Use WxWidgets to create desktop apps with a truly native look and feel on all major platforms. Works with the C++ and Neko targets, and integrates with NME.


An important goal of scientific data analysis is to understand the behavior of a system or process based on a sample of the system. In many instances it is possible to observe both input parameters and system outputs, and characterize the system as a high-dimensional function. Such data sets arise, for instance, in large numerical simulations, as energy landscapes in optimization problems, or in the analysis of image data relating to biological or medical parameters. This paper proposes an approach to analyzing and visualizing such data sets. The proposed method combines topological and geometric techniques to provide interactive visualizations of discretely sampled high-dimensional scalar fields. The method relies on a segmentation of the parameter space using an approximate Morse-Smale complex on the cloud of point samples. For each crystal of the Morse-Smale complex, a regression of the system parameters with respect to the output yields a curve in the parameter space. The result is a simplified geometric representation of the Morse-Smale complex in the high dimensional input domain. Finally, the geometric representation is embedded in 2D, using dimension reduction, to provide a visualization platform. The geometric properties of the regression curves enable the visualization of additional information about each crystal such as local and global shape, width, length, and sampling densities. The method is illustrated on several synthetic examples of two dimensional functions. Two use cases, using data sets from the UCI machine learning repository, demonstrate the utility of the proposed approach on real data. Finally, in collaboration with domain experts, the proposed method is applied to two scientific challenges: the analysis of parameters of climate simulations and their relationship to predicted global energy flux, and the concentrations of chemical species in a combustion simulation and their integration with temperature.


Adaptive, or self-aware, computing has been proposed as one method to help application programmers confront the growing complexity of multicore software development.

However, existing approaches to adaptive systems are largely ad hoc and often do not manage to incorporate the true performance goals of the applications they are designed to support.

This project proposed an enabling technology for adaptive computing systems: Application Heartbeats. The Application Heartbeats framework provides a simple, standard programming interface that applications can use to indicate their performance and system software (and hardware) can use to query an application’s performance.
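The interface idea above (the application signals progress, an observer queries performance) can be sketched in a few lines. This is an illustrative sketch only, not the Application Heartbeats API: the class name `Heartbeat`, its methods `beat` and `rate`, and the sliding-window policy are all assumptions made for the example.

```python
import time
from collections import deque


class Heartbeat:
    """Hypothetical heartbeat monitor: the application beats, observers query."""

    def __init__(self, window=10, clock=time.monotonic):
        self._times = deque(maxlen=window)  # timestamps of the most recent beats
        self._clock = clock

    def beat(self):
        """Called by the application each time it completes a unit of work."""
        self._times.append(self._clock())

    def rate(self):
        """Beats per second over the window, or 0.0 with fewer than two beats."""
        if len(self._times) < 2:
            return 0.0
        elapsed = self._times[-1] - self._times[0]
        if elapsed <= 0:
            return 0.0
        return (len(self._times) - 1) / elapsed


# Deterministic demonstration with a fake clock that advances 0.5 s per beat.
ticks = iter(x * 0.5 for x in range(100))
hb = Heartbeat(window=5, clock=lambda: next(ticks))
for _ in range(8):
    hb.beat()
observed_rate = hb.rate()
```

An adaptive runtime could compare `rate()` against a target rate declared by the application and adjust resources accordingly; how that decision is made is outside this sketch.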


Hermes2D (Higher-order modular finite element system) is a C++/Python library of algorithms for rapid development of adaptive hp-FEM solvers. hp-FEM is a modern version of the finite element method (FEM) that is capable of extremely fast, exponential convergence.

The Hermes library can be used for a large variety of PDE problems ranging from linear elliptic equations to time-dependent nonlinear multi-physics PDE systems arising in elasticity, structural mechanics, fluid mechanics, acoustics, electromagnetics, and other fields of computational engineering and science.

The Documentation for the Hermes libraries is an extensive set of instructions, information and tutorials related to the use of Hermes and the Finite Element Method. Hermes includes instructions for the installation of collaborating Third Party Libraries (TPLs) as well as an introduction to the mathematics behind the hp-FEM method and detailed instructions on the use and modification of the code.


HHVM is an open-source virtual machine designed for executing programs written in Hack and PHP. HHVM uses a just-in-time (JIT) compilation approach to achieve superior performance while maintaining the development flexibility that PHP provides.


A massively-parallel high-performance x-ray scattering data analysis code. HipGISAXS is a massively parallel software, which we have developed using C++, augmented with MPI, Nvidia CUDA, OpenMP, and parallel-HDF5 libraries, on large-scale clusters of multi/many-cores and graphics processors. HipGISAXS currently supports *NIX based systems, and is able to harness computational power from any general-purpose CPUs including state-of-the-art multicores, as well as Nvidia GPUs and Intel MIC coprocessors. It is able to handle large input data including any custom complex morphology as described in the following, and perform GISAXS simulations at high resolutions.


HLib is a program library for hierarchical matrices and H2-matrices. H-matrices are a powerful tool for representing and working with dense (and sparse) matrices, e.g. from integral or partial differential equations. They allow the complete matrix algebra, e.g. matrix-vector multiplication, matrix addition, multiplication, inversion and factorisation in almost linear time with respect to the number of rows and columns.

HLIBpro contains various algorithms for the approximation of dense matrices, e.g. ACA and HCA, the complete set of available H-algebra, various clustering techniques, e.g. geometric and algebraic clustering, many functions for discretising integral equations, e.g. Laplace, Helmholtz and Maxwell equations. A special focus of HLIBpro lies in the parallelisation of these methods to shared (threads) and distributed memory machines (MPI).


Higher Order (Symplectic) Methods in Python. Explicit algorithms for higher order symplectic integration of a large class of Hamilton’s equations have recently been discussed by Mushtaq et al. Here we present a Python program for automatic numerical implementation of these algorithms for a given Hamiltonian, both for double precision and multiprecision computations. We provide examples of how to use this program, and illustrate behavior of both the code generator and the generated solver module(s).


HOP is a multi-tier programming language for the Web 2.0 and the so-called diffuse Web. It is designed for programming interactive web applications in many fields such as multimedia (web galleries, music players, …), ubiquitous computing and home automation (smartphones, personal appliances), mashups, office (web agendas, mail clients, …), etc.

HOP features include:

  • an extensive set of widgets for programming fancy and portable Web GUIs,

  • full compatibility with traditional Web technologies (JavaScript, HTML, CSS),

  • HTML5 support,

  • a versatile Web server supporting HTTP/1.0 and HTTP/1.1,

  • native multimedia support for enabling ubiquitous Web multimedia applications,

  • fast WebDAV level 1 support,

  • an optimizing native code compiler for server code,

  • an on-the-fly JavaScript compiler for client code,

  • an extensive set of libraries for the mail, calendars, databases, Telephony


A Python Just-In-Time compiler for astrophysical computations. In order to combine the ease of Python and the speed of C, we developed HOPE, a specialised Python just-in-time (JIT) compiler designed for numerical astrophysical applications. HOPE focuses on a subset of the language and is able to translate Python code into C while performing numerical optimisation on mathematical expressions at runtime. To enable the JIT compilation, the user only needs to add a decorator to the function definition. We assess the performance of HOPE by performing a series of benchmarks and compare its execution speed with that of plain Python, C and the other existing frameworks. We find that HOPE improves the performance compared to plain Python by a factor of 2 to 120, achieves speeds comparable to that of C, and often exceeds the speed of the existing solutions.


HPC-GAP is the EPSRC funded project to reengineer the software for computation in algebra and discrete mathematics to take advantage of the power of current and future high-performance computers. Our main focus is on the GAP system and the more recent SymGridPar middleware, which provide flexible and effective computation on single processors and small clusters. We will adapt the software to efficiently use large clusters of multi-core processors to perform larger computations. To demonstrate the effectiveness of our adaptations we will apply our new software to problems from a number of important areas of pure mathematics.


The High Performance Geostatistics Library is written in C++ / Python to realize some geostatistical algorithms. The algorithms are called in Python, by executing the corresponding commands.


HPGMG implements full multigrid (FMG) algorithms using finite-volume and finite-element methods. Different algorithmic variants adjust the arithmetic intensity and architectural properties that are tested. These FMG methods converge up to discretization error in one F-cycle, thus may be considered direct solvers. An F-cycle visits the finest level a total of two times, the first coarsening (8x smaller) 4 times, the second coarsening 6 times, etc.

HPGMG-FV solves constant- and variable-coefficient elliptic problems on isotropic Cartesian grids using Full Multigrid (FMG). The method is second-order accurate in the max norm, as demonstrated by the FMG convergence. FMG interpolation (prolongation) is linear and V-cycle interpolation and restriction are piecewise constant. Recursive decomposition is used to construct a space filling curve akin to Z-Mort in order to distribute work among processes. Chebyshev polynomials are used for smoothing, preconditioned by the diagonal. FMG convergence is observed with a fourth order Chebyshev polynomial using a V(4,4) cycle. Thus convergence is reached in a total of 9 fine-grid operator applications (4 presmooths, residual, 4 postsmooths). This makes HPGMG-FV extremely fast and energy efficient.
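The idea of distributing work along a space-filling curve can be sketched with a plain Z-order (Morton) index. This is an illustration under stated assumptions, not HPGMG's actual decomposition code: the functions `morton3` and `partition` are hypothetical, and the boxes are assumed to be identified by small non-negative integer coordinates.

```python
def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into a single Z-order index."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (3 * i)
        index |= ((y >> i) & 1) << (3 * i + 1)
        index |= ((z >> i) & 1) << (3 * i + 2)
    return index


def partition(boxes, num_procs):
    """Sort boxes along the curve and cut it into contiguous, near-equal chunks."""
    ordered = sorted(boxes, key=lambda b: morton3(*b))
    chunk, extra = divmod(len(ordered), num_procs)
    parts, start = [], 0
    for rank in range(num_procs):
        size = chunk + (1 if rank < extra else 0)
        parts.append(ordered[start:start + size])
        start += size
    return parts


# A 2 x 2 x 2 arrangement of boxes split among four processes.
boxes = [(x, y, z) for x in range(2) for y in range(2) for z in range(2)]
parts = partition(boxes, 4)
```

Because nearby indices on the curve tend to be nearby in space, each process receives a spatially compact set of boxes, which keeps neighbor communication local.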


HPX (High Performance ParalleX) is a general purpose C++ runtime system for parallel and distributed applications of any scale. It strives to provide a unified programming model which transparently utilizes the available resources to achieve unprecedented levels of scalability. This library strictly adheres to the C++11 Standard and leverages the Boost C++ Libraries which makes HPX easy to use, highly optimized, and very portable. HPX is developed for conventional architectures including Linux-based systems, Windows, Mac, and the BlueGene/Q, as well as accelerators such as the Xeon Phi.

The goal of HPX is to create a high quality, freely available, open source implementation of the ParalleX model for conventional systems, such as classic Linux based Beowulf clusters or multi-socket highly parallel SMP nodes. At the same time, we want to have a very modular and well designed runtime system architecture which would allow us to port our implementation onto new computer system architectures. We want to use real world applications to drive the development of the runtime system, coining out required functionalities and converging onto a stable API which will provide a smooth migration path for developers. The API exposed by HPX is modelled after the interfaces defined by the C++11/14 ISO standard and adheres to the programming guidelines used by the Boost collection of C++ libraries.


HPXPI is an implementation of the XPI specification on top of the HPX runtime system. It is currently based on the XPI document version r313.

XPI (eXtreme ParalleX Interface) is a programming interface for parallel applications and systems based on the ParalleX execution model. XPI provides a simple abstraction layer over the HPX family of ParalleX runtime system implementations. As HPX evolves, XPI insulates application codes from such changes, ensuring stability of experimental application codes. XPI serves both as a target for source-to-source compilers of high-level languages and as a readable low-level programming interface syntax. XPI is experimental and supports current on-going sponsored research projects. Its long-term future is entirely dependent on its resulting value, an unknown at this time, but it is motivated by a short-term need to advance key project goals.


Bindings for HPX for various scripting languages, currently Python and Lua.


HSL (formerly the Harwell Subroutine Library) is a collection of state-of-the-art packages for large-scale scientific computation written and developed by the Numerical Analysis Group at the STFC Rutherford Appleton Laboratory and other experts. HSL offers users a high standard of reliability and has an international reputation as a source of robust and efficient numerical software. Among its best known packages are those for the solution of sparse linear systems of equations and sparse eigenvalue problems. MATLAB interfaces are offered for selected packages.

The Library was started in 1963 and was originally used at the Harwell Laboratory on IBM mainframes running under OS and MVS. Over the years, the Library has evolved and has been extensively used on a wide range of computers, from supercomputers to modern PCs. Recent additions include optimised support for multicore processors.

HSL packages are available at no cost for academic research and teaching. See download links for individual packages in the catalogue.


Bring the best of JavaScript data visualization to R. Use JavaScript visualization libraries at the R console, just like plots. Embed widgets in R Markdown documents and Shiny web applications. Develop new widgets using a framework that seamlessly bridges R and JavaScript.


Hugo is a general-purpose website framework. Technically speaking, Hugo is a static site generator. This means that, unlike systems like WordPress, Ghost and Drupal, which run on your web server expensively building a page every time a visitor requests one, Hugo does the building when you create your content. Since websites are viewed far more often than they are edited, Hugo is optimized for website viewing while providing a great writing experience.

Sites built with Hugo are extremely fast and very secure. Hugo sites can be hosted anywhere, including Heroku, GoDaddy, DreamHost, GitHub Pages, Amazon S3 and CloudFront, and work well with CDNs. Hugo sites run without dependencies on expensive runtimes like Ruby, Python or PHP and without dependencies on any databases.


The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, …​) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.


An open source distributed consistent key-value datastore. It differentiates itself from other distributed key-value datastores by claiming to offer consistency and multi-dimensional hashing, on top of the usual performance, availability, and throughput guarantees. A performance test of HyperDex, using a setup identical to that of an independent study that evaluated Cassandra, MongoDB, and HBase side by side, shows HyperDex to have superior throughput and latency. Multi-dimensional hashing is achieved through a mechanism called hyperspace hashing, which differs from BigTable’s multiple-column approach. The consistency guarantee is achieved through a novel chaining protocol.

HyperDex provides one of the richest APIs in the NoSQL space. Its datatype support is unparalleled by other sharded data stores. It supports bulk asynchronous operations. And it boasts the fastest, consistent, online backups in the industry. All of these are accessible with bindings from C, C++, Python, Ruby, Java, Go, and Rust.
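The multi-dimensional hashing mentioned above can be pictured as giving each attribute its own axis and hashing every attribute value to a coordinate on that axis. The sketch below illustrates that idea only; it is not HyperDex code or its API, and `BUCKETS`, `axis_bucket`, `region_of`, and `regions_for_search` are hypothetical names with an assumed fixed number of partitions per axis.

```python
import hashlib
from itertools import product

BUCKETS = 4  # hypothetical number of partitions along each axis


def axis_bucket(value):
    """Deterministically hash one attribute value to a coordinate on its axis."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def region_of(obj, attributes):
    """An object lands in the region given by one hashed coordinate per attribute."""
    return tuple(axis_bucket(obj[name]) for name in attributes)


def regions_for_search(query, attributes):
    """Regions that can hold matches: fixed on queried axes, free on the rest."""
    axes = []
    for name in attributes:
        if name in query:
            axes.append([axis_bucket(query[name])])
        else:
            axes.append(range(BUCKETS))
    return set(product(*axes))


ATTRS = ("first_name", "last_name", "phone")
record = {"first_name": "Ada", "last_name": "Lovelace", "phone": "555-0100"}
home = region_of(record, ATTRS)
candidates = regions_for_search({"last_name": "Lovelace"}, ATTRS)
```

A search on one of three attributes only has to contact the servers responsible for `BUCKETS ** 2` of the `BUCKETS ** 3` regions, and each additional attribute in the query narrows the set further.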


HSQLDB (HyperSQL DataBase) is the leading SQL relational database software written in Java. It offers a small, fast multithreaded and transactional database engine with in-memory and disk-based tables and supports embedded and server modes. It includes a powerful command line SQL tool and simple GUI query tools.


IBEX is a C++ library for constraint processing over real numbers. It provides reliable algorithms for handling non-linear constraints. In particular, roundoff errors are also taken into account. It is based on interval arithmetic and affine arithmetic. The main feature of Ibex is its ability to build strategies declaratively through the contractor programming paradigm. It can also be used as a black-box solver.

It can be used to solve a variety of problems that can roughly be formulated as finding a reliable characterization, with boxes (Cartesian products of intervals), of sets implicitly defined by constraints. Reliable means that all sources of uncertainty should be taken into account, including:

  • approximation of real numbers by floating-point numbers

  • round-off errors

  • linearization truncation errors

  • model parameter uncertainty

  • measurement noise
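The first two sources of uncertainty in the list above are what interval arithmetic addresses directly. The following is a minimal sketch of the idea in plain Python (3.9 or later, for `math.nextafter`), not the IBEX API: the `Interval` class is hypothetical, and widening each bound by one floating-point step is a simple, conservative stand-in for the directed rounding a real interval library would use.

```python
import math
from fractions import Fraction


class Interval:
    """Closed interval [lo, hi], widened by one float step after each operation."""

    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi

    @staticmethod
    def _outward(lo, hi):
        # Widen by one floating-point step on each side so that rounding
        # in the bound computation cannot exclude the exact result.
        return Interval(math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._outward(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval._outward(min(products), max(products))

    def contains(self, value):
        return self.lo <= value <= self.hi

    def __repr__(self):
        return f"[{self.lo!r}, {self.hi!r}]"


a, b = Interval(0.1), Interval(0.2)
total = a + b
# Exact rational sum of the two stored floating-point values.
exact_sum = Fraction(0.1) + Fraction(0.2)
encloses_exact = Fraction(total.lo) <= exact_sum <= Fraction(total.hi)
square = Interval(-1.0, 2.0) * Interval(-1.0, 2.0)
```

Note that `square` encloses [-2, 4] rather than the tighter [0, 4]: naive interval multiplication treats the two operands as independent, which is one reason libraries in this area add contractors and other refinements on top of basic interval operations.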


ICALAB for Signal Processing and ICALAB for Image Processing are two independent demo packages for MATLAB that implement a number of efficient algorithms for ICA (independent component analysis) employing HOS (higher order statistics), BSS (blind source separation) employing SOS (second order statistics) and LP (linear prediction), and BSE (blind signal extraction) employing various SOS and HOS methods.


An open-source, distributed, time series database with no external dependencies. InfluxDB is a time series, metrics, and analytics database. It’s written in Go and has no external dependencies. That means once you install it there’s nothing else to manage (like Redis, ZooKeeper, HBase, or whatever). InfluxDB is targeted at use cases for DevOps, metrics, sensor data, and real-time analytics.


Instrumentino is an open-source modular graphical user interface framework for controlling Arduino based experimental instruments. It expands the control capability of Arduino by allowing instrument builders to easily create a custom user interface program running on an attached personal computer. It enables the definition of operation sequences and their automated running without user intervention.

Acquired experimental data and a usage log are automatically saved on the computer for further processing.

Complex devices, which are difficult to control using an Arduino, may be integrated as well by incorporating third party application programming interfaces (APIs) into the Instrumentino framework.

Interactive Spaces

Interactive Spaces is a software platform which allows you to merge the virtual world with the physical world. By making it easy to connect sensors to applications running on different machines in a space, quite complex behaviors can be built.

Interactive Spaces applications are built from units called Activities, which can easily communicate with each other no matter where they are on the local network. Through the Interactive Spaces communication system, called a route, any activity in the space can speak to or listen to messages from any other activity it chooses. This means you can easily control and synchronize events across a collection of machines.


The Internet-in-a-Box is a small, inexpensive device which provides essential Internet resources without any Internet connection. It provides a local copy of half a terabyte of the world’s Free information. It provides:

  • Wikipedia: Complete Wikipedia in a dozen different languages

  • Maps: Zoomable world-wide maps down to street level

  • E-books: Over 35 thousand e-books in a variety of languages

  • Software: Huge library of Open Source Software, including installable Ubuntu Linux OS with all software package repositories. Includes full source code for study or modification.

  • Video: Hundreds of hours of instructional videos

  • Chat: Simple instant messaging across the community

While the complete dataset for the Internet-in-a-Box project is over 700 GB, you can install the 500 MB QuickStart Sampler dataset to try the software without needing the full dataset.

There are several methods to install and run Internet-in-a-Box (IIAB). For installation: you can install IIAB as a Python package using the Python package manager pip. Or you can install from source using git. To run: you can run IIAB as a stand-alone server, or you can integrate it with your Apache installation using WSGI as a gateway.


Trendy stuff about wee hardware running this sort of software.


A collaborative open-source software framework that makes it easy for devices and apps to discover and communicate with each other. It supports many language bindings and can be easily integrated into platforms small and large. The AllJoyn framework defines a common way for devices and apps to communicate with one another, ushering in a new wave of interoperable devices to make the Internet of Things a reality.

The AllJoyn framework handles the complexities of discovering nearby devices, creating sessions between devices, and communicating securely between those devices. It abstracts out the details of the physical transports and provides a simple-to-use API. Multiple connection session topologies are supported, including point-to-point and group sessions. The security framework is flexible, supporting many mechanisms and trust models. And the types of data transferred are also flexible, supporting raw sockets or abstracted objects with well-defined interfaces, methods, properties, and signals.

One of the defining traits of the AllJoyn framework is its inherent flexibility. It was designed to run on multiple platforms, ranging from small embedded RTOS platforms to full-featured OSes. It supports multiple language bindings and transports. And since the AllJoyn framework is open-source, this flexibility can be extended further in the future to support even more transports, bindings, and features.


ActiveMQ Apollo is a faster, more reliable, easier to maintain messaging broker built from the foundations of the original ActiveMQ. It accomplishes this using a radically different threading and message dispatching architecture. Like ActiveMQ, Apollo is a multi-protocol broker and supports STOMP, AMQP, MQTT, Openwire, SSL, and WebSockets.


The Constrained Application Protocol (CoAP) is a specialized web transfer protocol for use with constrained nodes and constrained networks in the Internet of Things. The protocol is designed for machine-to-machine (M2M) applications such as smart energy and building automation.

See also Erbium and txThings.
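To make the "constrained" aspect concrete, here is a minimal Python sketch of the fixed 4-byte CoAP message header as laid out in RFC 7252: a 2-bit version, a 2-bit message type, a 4-bit token length, an 8-bit code and a 16-bit message ID, followed by the token. The function names, the message ID and the token value are illustrative assumptions, and options and payload are left out entirely.

```python
import struct

COAP_VERSION = 1
# Message types from RFC 7252: confirmable, non-confirmable, acknowledgement, reset
CON, NON, ACK, RST = 0, 1, 2, 3


def encode_header(msg_type, code_class, code_detail, message_id, token=b""):
    """Pack the fixed 4-byte CoAP header followed by the token."""
    if len(token) > 8:
        raise ValueError("CoAP tokens are at most 8 bytes long")
    # First byte: version (2 bits) | type (2 bits) | token length (4 bits)
    first = (COAP_VERSION << 6) | (msg_type << 4) | len(token)
    # Code byte: class (3 bits) | detail (5 bits), written as c.dd (e.g. 0.01 = GET)
    code = (code_class << 5) | code_detail
    return struct.pack("!BBH", first, code, message_id) + token


def decode_header(data):
    """Unpack the header fields from the start of a CoAP message."""
    first, code, message_id = struct.unpack("!BBH", data[:4])
    token_length = first & 0x0F
    return {
        "version": first >> 6,
        "type": (first >> 4) & 0x03,
        "code": f"{code >> 5}.{code & 0x1F:02d}",
        "message_id": message_id,
        "token": data[4:4 + token_length],
    }


# A confirmable GET (code 0.01) with a made-up message ID and one-byte token.
packet = encode_header(CON, 0, 1, message_id=0x1234, token=b"\xab")
fields = decode_header(packet)
```

The whole request header here occupies five bytes, which is the kind of overhead that suits small packets on constrained networks.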


Californium (Cf) is an open source implementation of the Constrained Application Protocol (CoAP). It is written in Java and targets unconstrained environments such as back-end service infrastructures (e.g., proxies, resource directories, or cloud services) and less constrained environments such as embedded devices running Linux (e.g., smart home/factory controllers or cellular gateways). Californium (Cf) has served as running code for the IETF standardization of CoAP and was recently reimplemented from scratch, drawing on the experience gained. In particular, Cf now focuses on service scalability for large-scale Internet of Things applications. The new implementation was successfully tested at the ETSI CoAP and OMA LWM2M Plugtests in November 2013 and March 2014. It complies with all mandatory and optional test cases.


This implements a lightweight application protocol for devices that are constrained in their resources, such as computing power, RF range, memory, bandwidth, or network packet sizes. This protocol, CoAP, was standardized in the IETF as RFC 7252.


GSN is a Java environment that runs on one or more computers composing the backbone of the acquisition network. A set of wrappers feeds live data into the system. The data streams are then processed according to XML specification files. The system is built upon a concept of sensors (real sensors or virtual sensors, that is, new data sources created from live data) that are connected together in order to build the required processing path. For example, one can imagine an anemometer that sends its data into GSN through a wrapper (various wrappers are already available and writing new ones is quick). That data stream could then be sent to an averaging mote, whose output could be split and sent both to a database for recording and to a web site for displaying the average measured wind in real time. All of this could be done by editing only a few XML files in order to connect the various motes together.
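The anemometer example can be sketched in a few lines of plain Python. This is only an illustration of the data flow (wrapper, averaging stage, fan-out to two consumers); GSN itself wires these stages together through XML files, and every name and number below is a hypothetical stand-in rather than part of GSN.

```python
from collections import deque


def anemometer_wrapper(samples):
    """Stand-in for a wrapper: yields raw wind-speed readings one by one."""
    for value in samples:
        yield value


def averaging_sensor(stream, window=3):
    """Virtual sensor: emits the mean of the last `window` readings."""
    recent = deque(maxlen=window)
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


def fan_out(stream, *sinks):
    """Split the stream by handing every element to each sink."""
    for value in stream:
        for sink in sinks:
            sink(value)


# Two lists stand in for the database and the web display.
database, web_display = [], []
readings = [4.0, 6.0, 8.0, 10.0]  # made-up wind speeds
fan_out(
    averaging_sensor(anemometer_wrapper(readings)),
    database.append,
    web_display.append,
)
```

Because each stage only consumes and produces a stream, swapping the wrapper for a different data source leaves the averaging and fan-out stages untouched.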

GSN is designed to make sensor network application development a pleasure. Applications based on GSN are hardware-independent, which makes changes to the sensor network invisible to the application. For instance, you can change the underlying sensor network from Mica2 nodes to BTNodes (with compatible sensing boards) without ever touching a single line of code in the application.

Now you have all the common sensor network requirements in one package plus the support for dozens of well known sensing hardware.


IoTivity is an open source software framework enabling seamless device-to-device connectivity to address the emerging needs of the Internet of Things.


An integration middleware for the Internet of Things. It provides a communication stack for embedded devices based on IPv6, Web services and oBIX to provide interoperable interfaces for smart objects. Using 6LoWPAN for constrained wireless networks and the Constrained Application Protocol together with Efficient XML Interchange, it provides an efficient stack that allows interoperable Web technologies to be used in sensor and actuator networks and systems while remaining nearly as efficient in transmission message sizes as existing automation systems. The IoTSyS middleware aims to provide a gateway concept for existing sensor and actuator systems found in today's home and building automation systems, and a stack which can be deployed directly on embedded 6LoWPAN devices. It further addresses security, discovery and scalability issues.


Kura is a Java/OSGi-based framework for IoT gateways. Kura APIs offer access to the underlying hardware (serial ports, GPS, watchdog, GPIOs, I2C, etc.), management of network configurations, communication with M2M/IoT Integration Platforms, and gateway management.


A project to improve the ease of wireless sensor network programming by allowing programmers to express high-level objectives, and leave the low-level details to the compiler and run-time system. The goal is also to enable easy integration with other systems, such as business systems, mainly via the usage of business process modeling.

A makeSense tutorial is available. The tutorial comes in a virtual machine (so the file to download is quite large) that has all the software you need to learn and understand how to develop sensor network applications using makeSense. The tutorial has two parts. In the first part, domain experts can use our model editor (with extended BPMN) to develop sensor network applications. In the second part, programmers (including those with limited or no knowledge of sensor networks) can learn how to use the makeSense macroprogramming language to develop sensor network applications.


The Mihini project delivers an embedded runtime running on top of Linux, that exposes a high-level Lua API for building Machine-to-Machine applications.


MQTT stands for MQ Telemetry Transport. It is a publish/subscribe, extremely simple and lightweight messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks. The design principles are to minimise network bandwidth and device resource requirements whilst also attempting to ensure reliability and some degree of assurance of delivery. These principles also turn out to make the protocol ideal for the emerging “machine-to-machine” (M2M) or “Internet of Things” world of connected devices, and for mobile applications where bandwidth and battery power are at a premium.
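The publish/subscribe model can be illustrated with a small in-memory sketch in Python. It is not an MQTT implementation: there is no network, no quality-of-service handling and no retained messages. It only shows how subscribers register topic filters, using MQTT's `+` (one level) and `#` (all remaining levels) wildcards, and how a broker routes each published message to the matching subscribers. The `Broker` class and the topic names are assumptions made up for the example.

```python
from collections import defaultdict


def topic_matches(topic_filter, topic):
    """Return True if an MQTT-style topic filter matches a topic name.

    '+' matches exactly one level and '#' matches all remaining levels.
    Simplified: topics beginning with '$' are not treated specially.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class Broker:
    """Toy in-memory broker: routes published messages to subscribers."""

    def __init__(self):
        self.subscriptions = defaultdict(list)

    def subscribe(self, topic_filter, callback):
        self.subscriptions[topic_filter].append(callback)

    def publish(self, topic, payload):
        for topic_filter, callbacks in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)


broker = Broker()
received = []
broker.subscribe("sensors/+/temperature", lambda t, p: received.append((t, p)))
broker.publish("sensors/kitchen/temperature", "21.5")
broker.publish("sensors/kitchen/humidity", "40")  # no matching subscriber
```

Publishers and subscribers never refer to each other directly; they only share topic names, which is what keeps the per-device logic small.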


The Mosquitto project provides an open-source implementation of an MQTT broker. It implements the MQ Telemetry Transport protocol versions 3.1 and 3.1.1. MQTT provides a lightweight method of carrying out messaging using a publish/subscribe model. This makes it suitable for "machine to machine" messaging such as with low power sensors or mobile devices such as phones, embedded computers or microcontrollers like the Arduino.

The Mosquitto broker is the focus of the project and aims to be a lightweight and functional MQTT broker that can run on relatively constrained systems, but still be powerful enough for a wide range of applications. The mosquitto_pub and mosquitto_sub command line utilities provide a straightforward and powerful way of interacting with your broker. The client library that the utilities use for their MQTT support can be used to develop your own MQTT applications.


A web interface for MQTT. A simple web interface which is able to subscribe to an MQTT topic and display the information.


An open source utility intended to help you with monitoring activity on MQTT topics. It has been designed to deal with high volumes of messages, as well as occasional publications. It is a JavaFX application that should work on any operating system with an appropriate version of Java 8 installed. mqtt-spy-daemon is a Java-based command line tool that does not require a GUI environment. Basic functionality works with Java 7, whereas some of the advanced features like scripting require Java 8 to be installed.


The Paho project provides open-source client implementations of open and standard messaging protocols aimed at new, existing, and emerging applications for Machine‑to‑Machine (M2M) and Internet of Things (IoT).


The Internet of Things (IoT) is set to occupy a substantial component of the future Internet. The IoT connects sensors and devices that record physical observations to applications and services of the Internet. As a successor to technologies such as RFID and Wireless Sensor Networks (WSN), the IoT has stumbled into vertical silos of proprietary systems, providing little or no interoperability with similar systems. As the IoT represents the future state of the Internet, an intelligent and scalable architecture is required to provide connectivity between these silos, enabling discovery of physical sensors and interpretation of messages between things. This paper proposes a gateway and Semantic Web enabled IoT architecture to provide interoperability between systems using established communication and data standards. The Semantic Gateway as Service (SGS) allows translation between messaging protocols such as XMPP, CoAP and MQTT via a multi-protocol proxy architecture. Utilization of broadly accepted specifications such as W3C’s Semantic Sensor Network (SSN) ontology for semantic annotations of sensor data provides semantic interoperability between messages and supports semantic reasoning to obtain higher-level actionable knowledge from low-level sensor data.


Software for integrating different home automation systems and technologies into one single solution that allows over-arching automation rules and that offers uniform user interfaces. The open Home Automation Bus (openHAB) project aims at providing a universal integration platform for all things around home automation. It is a pure Java solution, fully based on OSGi. The Equinox OSGi runtime and Jetty as a web server build the core foundation of the runtime.

It is designed to be absolutely vendor-neutral as well as hardware/protocol-agnostic. openHAB brings together different bus systems, hardware devices and interface protocols by dedicated bindings. These bindings send and receive commands and status updates on the openHAB event bus. This concept allows designing user interfaces with a unique look and feel, but with the possibility to operate devices based on a large number of different technologies. Besides the user interfaces, it also brings the power of automation logic across different system boundaries.


OpenRemote is a software integration platform for residential and commercial building automation. The OpenRemote platform is automation protocol agnostic, operates on off-the-shelf hardware and is freely available under an Open Source license. OpenRemote’s architecture enables fully autonomous and user-independent intelligent buildings. End-user control interfaces are available for iOS and Android devices, and for devices with modern web browsers. User interface design, installation management and configuration can be handled remotely with OpenRemote cloud-based design tools.

The supported protocols/devices include TCP/IP, Telnet, HTTP/REST, RS-232, AMX, KNX, Lutron, Z-Wave, 1-Wire, EnOcean, xPL, Insteon, X10, infrared, Russound, GlobalCache, IRTrans, VLC, panStamp, Denon AVR, Freebox, MythTV and more.


The OSGi specification describes a modular system and a service platform for the Java programming language that implements a complete and dynamic component model, something that does not exist in standalone Java/VM environments. Applications or components, coming in the form of bundles for deployment, can be remotely installed, started, stopped, updated, and uninstalled without requiring a reboot; management of Java packages/classes is specified in great detail. Application life cycle management is implemented via APIs that allow for remote downloading of management policies. The service registry allows bundles to detect the addition of new services, or the removal of services, and adapt accordingly.
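The service registry idea, in which components are told when services appear or disappear and adapt accordingly, can be sketched as follows. This is a language-neutral illustration written in Python, not the OSGi Java API; the class, method and event names are assumptions chosen for the example.

```python
class ServiceRegistry:
    """Toy registry that notifies listeners when services come and go."""

    def __init__(self):
        self._services = {}
        self._listeners = []

    def add_listener(self, listener):
        """Register a callable taking (event, service_name)."""
        self._listeners.append(listener)

    def register(self, name, service):
        self._services[name] = service
        self._notify("added", name)

    def unregister(self, name):
        del self._services[name]
        self._notify("removed", name)

    def get(self, name):
        """Return the service, or None if it is not currently registered."""
        return self._services.get(name)

    def _notify(self, event, name):
        for listener in self._listeners:
            listener(event, name)


registry = ServiceRegistry()
events = []
registry.add_listener(lambda event, name: events.append((event, name)))

registry.register("logger", print)   # a component publishes a service
registry.unregister("logger")        # and later withdraws it
```

A client that depends on the `logger` service would use the listener callback to start or stop using it, rather than assuming it is always present.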

The OSGi specifications have evolved beyond the original focus of service gateways, and are now used in applications ranging from mobile phones to the open-source Eclipse IDE. Other application areas include automobiles, industrial automation, building automation, PDAs, grid computing, entertainment, fleet management and application servers.


Remote Services for OSGi runs as an OSGi bundle and facilitates distribution for arbitrary OSGi framework implementations. All that a service provider framework has to do is register a service for remote access. Subsequently, other peers can connect to the service provider peer and get access to the service. Remote services are accessed in an entirely transparent way. For every remote service, a local proxy bundle is generated that registers the same service. Local service clients can hence access the remote service in the same way and without regard to distribution.

Even though Remote Services for OSGi is a sophisticated middleware for OSGi frameworks, it uses a very efficient network protocol and has a small footprint. This makes it ideal for small and embedded devices with limited memory and network bandwidth. The service runs on every OSGi-compliant environment. Remote Services for OSGi has been tested with Eclipse Equinox, Knopflerfish, and Oscar / Apache Felix, as well as with our own lightweight OSGi implementation Concierge. Our test platforms include a variety of different devices, hardware architectures and Java VMs.


Our goal is to provide a programming environment where sensors can be programmed as an ensemble rather than individually. Here programmers will focus on the applications and on the services provided by the collection of sensors rather than on which particular sensor to program or on how communication will take place. Energy levels and scheduling will to some extent be controlled by the programmers, e.g., to give priority to specific application tasks. Briefly put, we will employ macro programming of networks of sensors rather than micro programming of individual sensors. We will focus on wireless sensor networks, but the general concept of ensemble programming is applicable to other multiprocessor areas such as grid computing, multicore, and server farms, like Google, etc. We will demonstrate that sensor networks can be programmed as an ensemble under severe constraints, including being devices with very limited capabilities, frequently failing sensors, mobility, energy constraints and harsh security threats.

ProFuN TG is a high-level programming environment for wireless sensor networks. It helps the application programmer with software design, deployment and maintenance. The tool is customizable: both user-defined tasks and user-defined objective functions for task mapping are possible.


Eclipse SmartHome is a framework for building smart home solutions. As such, it consists of a rich set of OSGi bundles that serve different purposes. Not all solutions that build on top of Eclipse SmartHome will require all of those bundles - instead they can choose what parts are interesting for them.


TANGO is a software toolkit for connecting things together, building control systems, and integrating systems. It is free, open source and object-oriented. It is easy to use and is well adapted to solving simple and complex distributed problems. TANGO Controls has been used to build solutions for:

  • Distributed Control Systems (DCS) in which devices are controlled and monitored in a local distributed network

  • Supervisory Control And Data Acquisition (SCADA) systems in which remote devices are controlled and monitored centrally

  • Integrated Control Systems (ICS) in which different autonomous control systems are integrated into a central one

  • Interface Devices that run on small embedded platforms into a distributed control system

  • Internet of Things (IoT) applications in which arbitrary devices are controlled through the Internet

  • Machine to Machine (M2M) applications in which devices communicate with each other

  • System Integration Platforms in which different kinds of software applications and systems are integrated into a central one

TANGO Controls is operating system independent and supports C++, Java and Python for all of the components.


Taurus is a Python framework for both CLI and GUI Tango applications. It is built on top of PyTango and PyQt. Taurus stands for TAngo User interface ‘R’ US.

the thing system

The Thing System is a set of software components and network protocols. Our steward software is written in node.js making it both portable and easily extensible. It can run on your laptop, or fit onto a small single board computer like the Raspberry Pi.

The steward is at the heart of the system and connects to Things in your home, whether those things are media players such as the Sonos or the Apple TV, your Nest thermostat, your INSTEON home control system, or your Philips Hue lightbulbs — whether your things are connected together via Wi-Fi, USB or Bluetooth Low Energy (BLE). The steward will find them and bring them together so they can talk to one another and perform magic.

Dozens of "things" are supported, with more on the way.

IoT Operating Systems


Contiki is an open source operating system for the Internet of Things. Contiki connects tiny low-cost, low-power microcontrollers to the Internet. Contiki provides powerful low-power Internet communication. Contiki supports fully standard IPv6 and IPv4, along with the recent low-power wireless standards: 6LoWPAN, RPL, CoAP. With Contiki’s ContikiMAC and sleepy routers, even wireless routers can be battery-operated.

Contiki is designed to run on classes of hardware devices that are severely constrained in terms of memory, power, processing power, and communication bandwidth. A typical Contiki system has memory on the order of kilobytes, a power budget on the order of milliwatts, processing speed measured in megahertz, and communication bandwidth on the order of hundreds of kilobits/second. This class of systems includes both various types of embedded systems as well as a number of old 8-bit computers. Despite providing multitasking and a built-in TCP/IP stack, Contiki only needs about 10 kilobytes of RAM and 30 kilobytes of ROM.[1] A full system, complete with a graphical user interface, needs about 30 kilobytes of RAM.


CALIPSO builds Internet Protocol (IP) connected smart object networks, but with novel methods to attain very low power consumption, thereby providing both interoperability and long lifetimes. CALIPSO leans on the significant body of work on sensor networks to integrate radio duty cycling and data-centric mechanisms into the IPv6 stack, something that existing work has not previously done. CALIPSO works at three layers: the network, the routing, and the application layer. We also revisit architectural decisions on naming, identification, and the use of middle-boxes.

CALIPSO works within the IETF/IPv6 framework, which includes the recent IETF RPL and CoAP protocols. This gives a structure for evaluation that has not previously been available. We use the Contiki open source OS, Europe’s leading smart object OS, as the target development environment for prototyping and experimental evaluation.


Erbium (Er) is a low-power REST Engine for Contiki that was developed together with SICS (and a rare earth element that is found in the Ytterby mine near Stockholm). The REST Engine includes a comprehensive embedded CoAP implementation, which became the official one for the Contiki OS. It supports RFC 7252 together with blockwise transfers and observing.


RIOT is an operating system designed for the particular requirements of Internet of Things (IoT) scenarios. These requirements comprise a low memory footprint, high energy efficiency, real-time capabilities, a modular and configurable communication stack, and support for a wide range of low-power devices. RIOT provides a microkernel, utilities like cryptographic libraries, data structures (bloom filters, hash tables, priority queues), or a shell, different network stacks, and support for various microcontrollers, radio drivers, sensors, and configurations for entire platforms, e.g. TelosB or STM32 Discovery Boards.


IPFS is a distributed file system that seeks to connect all computing devices with the same system of files. In some ways, this is similar to the original aims of the Web, but IPFS is actually more similar to a single BitTorrent swarm exchanging Git objects.

It combines good ideas from Git, BitTorrent, Kademlia, SFS, and the Web. IPFS provides an interface as simple as the HTTP web, but with permanence built in.
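The "Git objects" comparison refers to content addressing: an object's name is derived from a hash of its content, so the same bytes always get the same address and any corruption is detectable. The Python sketch below uses Git's blob naming scheme (SHA-1 over a `blob <length>\0` header plus the content) purely as an illustration. IPFS has its own object and hash formats, which are not reproduced here, and the `ContentStore` class is a hypothetical stand-in for a real object store.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class ContentStore:
    """Toy content-addressed store: keys are derived from the values."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        key = git_blob_id(content)
        self._objects[key] = content  # storing the same bytes twice is a no-op
        return key

    def get(self, key: str) -> bytes:
        content = self._objects[key]
        # Anyone can verify that the content matches the address it was fetched by.
        if git_blob_id(content) != key:
            raise ValueError("stored content does not match its address")
        return content


store = ContentStore()
first = store.put(b"hello world\n")
second = store.put(b"hello world\n")  # identical content, identical address
```

Because the address depends only on the bytes, it does not matter which peer supplies an object, which is what makes a swarm-style exchange workable.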


IPredator provides you with an encrypted tunnel from your computer to the Internet. We are hiding your real IP address behind one of ours.


IRAF is the Image Reduction and Analysis Facility, a general purpose software system for the reduction and analysis of scientific data. IRAF is written and supported by the IRAF programming group at the National Optical Astronomy Observatories (NOAO) in Tucson, Arizona. IRAF includes a good selection of programs for general image processing and graphics applications, plus a large number of programs for the reduction and analysis of optical astronomy data within the NOAO package. External or layered packages are also available for the analysis of HST, XRAY and EUV data. IRAF provides a complete programming environment, which includes the Command Language script facility, the IMFORT Fortran programming interface, and the fully featured SPP/VOS programming environment in which the portable IRAF system is written.

See PyRAF.


Iris seeks to provide a powerful, easy to use, and community-driven Python library for analysing and visualising meteorological and oceanographic data sets.


IRPF90 is a Fortran programming environment which helps the development of large Fortran codes by applying the Implicit Reference to Parameters method (IRP).

In Fortran programs, the programmer has to focus on the order of the instructions: before using a variable, the programmer has to be sure that it has already been computed in all possible situations. For large codes, this is a common source of error.

In IRPF90 most of the order of instructions is handled by the pre-processor, and an automatic mechanism guarantees that every entity is built before being used. This mechanism relies on the needs/needed by relations between the entities, which are built automatically.

Codes written with IRPF90 often execute faster than Fortran programs, are faster to write and easier to maintain.
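The idea that every entity is built before it is used can be illustrated with a small Python sketch. It is not IRPF90 syntax and does not reproduce its pre-processor; it only shows entities defined by provider functions, with dependencies built on demand and cached, so the programmer never states the order of computation. The `Entities` class and the physical quantities are hypothetical.

```python
class Entities:
    """Builds each named entity on first use and caches the result."""

    def __init__(self):
        self._providers = {}
        self._values = {}
        self.build_order = []  # recorded only to show what happened

    def provider(self, func):
        """Decorator: register `func` as the builder of the entity named after it."""
        self._providers[func.__name__] = func
        return func

    def get(self, name):
        if name not in self._values:
            # Building an entity may recursively build the entities it needs.
            self._values[name] = self._providers[name](self)
            self.build_order.append(name)
        return self._values[name]


entities = Entities()


@entities.provider
def kinetic_energy(e):
    # Declared first, yet its dependencies are still built before it.
    return 0.5 * e.get("mass") * e.get("velocity") ** 2


@entities.provider
def mass(e):
    return 2.0


@entities.provider
def velocity(e):
    return 3.0


energy = entities.get("kinetic_energy")
```

Asking for `kinetic_energy` triggers the building of `mass` and `velocity` first, regardless of the order in which the providers were written.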


The Integrated System for Imagers and Spectrometers (ISIS) is a free, specialized, digital image processing software package developed by the USGS for NASA. The key feature of ISIS is the ability to place many types of data in the correct cartographic location, enabling disparate data to be co-analyzed. ISIS also includes standard image processing applications such as contrast, stretch, image algebra, filters, and statistical analysis. ISIS can process two-dimensional images as well as three-dimensional cubes derived from imaging spectrometers. The production of USGS topographic maps of extraterrestrial landing sites relies on ISIS software. ISIS is able to process data from NASA and international spacecraft missions including Lunar Orbiter, Apollo, Voyager, Mariner 10, Viking, Galileo, Magellan, Clementine, Mars Global Surveyor, Cassini, Mars Odyssey, Mars Reconnaissance Orbiter, MESSENGER, Lunar Reconnaissance Orbiter, Chandrayaan, Dawn, and Kaguya.

The ISIS software is a valuable resource for planetary missions that require systematic data processing, products for planning, and research and analysis of derived data products. By using ISIS, missions can leverage millions of dollars of software development that NASA has paid for. However, before the power of ISIS can be applied to an instrument, a camera model and custom programs to ingest mission-specific ancillary data are necessary. Once an instrument is added to ISIS, it can support data processing pipelines, radiometric calibration, photometric calibration, band-to-band registration of multispectral data, ortho-rectification, construction of scientifically accurate and cosmetically pleasing mosaics, generation of control network solutions and creation of topographic models.


A dynamic computer programming language.[5] It is most commonly used as part of web browsers, whose implementations allow client-side scripts to interact with the user, control the browser, communicate asynchronously, and alter the document content that is displayed.[5] It is also used in server-side network programming with runtime environments such as Node.js, game development and the creation of desktop and mobile applications. With the rise of the single-page web app and JavaScript-heavy sites, it is increasingly being used as a compile target for source-to-source compilers from both dynamic languages and static languages. In particular, Emscripten and highly optimised JIT compilers, in tandem with asm.js, which is friendly to AOT compilers like OdinMonkey, have enabled C and C++ programs to be compiled into JavaScript and execute at near-native speeds, leading JavaScript to be considered the "assembly language of the web",[6] according to its creator and others.


A low-level, extraordinarily optimizable subset of JavaScript. It is an intermediate programming language consisting of a strict subset of the JavaScript language. It enables significant performance improvements for web applications that are written in statically-typed languages with manual memory management (such as C) and then translated to JavaScript by a source-to-source compiler. Asm.js does not aim to improve the performance of hand-written JavaScript code, nor does it enable anything other than enhanced performance.

It is intended to have performance characteristics closer to that of native code than standard JavaScript by limiting language features to those amenable to ahead-of-time optimization and other performance improvements.[2] By using a subset of JavaScript, asm.js is already supported by all major web browsers,[3] unlike alternative approaches such as Google Native Client. Mozilla Firefox was the first web browser to implement asm.js-specific optimizations, starting with Firefox 22.[4] The optimizations of Google Chrome’s V8 JavaScript engine in Chrome 28 made asm.js benchmarks more than twice as fast as prior versions of Chrome.

See Emscripten.


DynJS is an ECMAScript runtime for the JVM.


Nashorn’s goal is to implement a lightweight high-performance JavaScript runtime in Java with a native JVM. This project intends to enable Java developers to embed JavaScript in Java applications via JSR-223 and to develop free-standing JavaScript applications using the jrunscript command-line tool.


Rhino is an open-source implementation of JavaScript written entirely in Java. It is typically embedded into Java applications to provide scripting to end users. It is embedded in J2SE 6 as the default Java scripting engine.

JavaScript Frameworks

Things that run on top of JavaScript.


An open source JavaScript framework for cross-platform mobile, desktop, TV and web applications emphasizing object-oriented encapsulation and modularity.


SpiderMonkey is Mozilla’s JavaScript engine written in C/C++. It is used in various Mozilla products, including Firefox, and is available under the MPL2.

SpiderMonkey is the code name for the first-ever JavaScript engine, written by Brendan Eich at Netscape Communications, later released as open source and now maintained by the Mozilla Foundation. SpiderMonkey provides JavaScript support for Mozilla Firefox and various embeddings such as the GNOME 3 desktop.

Eich "wrote JavaScript in ten days" in 1995, having been "recruited to Netscape with the promise of doing Scheme in the browser". (The idea of using Scheme was abandoned when "engineering management [decided] that the language must ‘look like Java’".) In the fall of 1996, Eich, needing to "pay off [the] substantial technical debt" left from the first year, "stayed home for two weeks to rewrite Mocha as the codebase that became known as SpiderMonkey". The name SpiderMonkey was chosen as a reference to the movie Beavis and Butt-head Do America, in which the character Tom Anderson mentions that the title characters were "whacking off like a couple of spider monkeys." In 2011, Eich transferred management of the SpiderMonkey code to Dave Mandelin.


The V8 JavaScript Engine is an open source JavaScript engine developed by Google for the Google Chrome web browser. V8 compiles JavaScript to native machine code (IA-32, x86-64, ARM, or MIPS ISAs) before executing it, instead of more traditional techniques such as interpreting bytecode or compiling the whole program to machine code and executing it from a filesystem. The compiled code is additionally optimized (and re-optimized) dynamically at runtime, based on heuristics of the code’s execution profile. Optimization techniques used include inlining, elision of expensive runtime properties, and inline caching, among many others.


Jekyll is a simple, blog-aware, static site generator perfect for personal, project, or organization sites. Think of it like a file-based CMS, without all the complexity. Jekyll takes your content, renders Markdown and Liquid templates, and spits out a complete, static website ready to be served by Apache, Nginx or another web server. Jekyll is the engine behind GitHub Pages, which you can use to host sites right from your GitHub repositories.


Complex systems are increasingly being viewed as distributed information processing systems, particularly in the domains of computational neuroscience, bioinformatics and Artificial Life. This trend has resulted in a strong uptake in the use of (Shannon) information-theoretic measures to analyse the dynamics of complex systems in these fields. We introduce the Java Information Dynamics Toolkit (JIDT): a Google code project which provides a standalone, (GNU GPL v3 licensed) open-source code implementation for empirical estimation of information-theoretic measures from time-series data. While the toolkit provides classic information-theoretic measures (e.g. entropy, mutual information, conditional mutual information), it ultimately focusses on implementing higher-level measures for information dynamics. That is, JIDT focusses on quantifying information storage, transfer and modification, and the dynamics of these operations in space and time. For this purpose, it includes implementations of the transfer entropy and active information storage, their multivariate extensions and local or pointwise variants. JIDT provides implementations for both discrete and continuous-valued data for each measure, including various types of estimator for continuous data (e.g. Gaussian, box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time due to Java’s object-oriented polymorphism. Furthermore, while written in Java, the toolkit can be used directly in MATLAB, GNU Octave, Python and other environments. We present the principles behind the code design, and provide several examples to guide users.


Joblib provides a simple helper class to write parallel for loops using multiprocessing. The core idea is to write the code to be executed as a generator expression, and convert it to parallel computing.


Jolie is an open-source programming language for developing distributed applications based on microservices. In the programming paradigm proposed with Jolie, each program is a service that can communicate with other programs by sending and receiving messages over a network.


An application launcher for portable Java applications on every computer everywhere you go. jPort creates a Java-enabled menu to launch dozens of free applications. jPort desktop does not require installation. Simply upload jPort on any desktop and hundreds of awesome applications will be at your fingertips.


In a decentralized computing environment, it is better practice to pass code to the various machines for execution (and then gather the results) when the application is dealing with huge amounts of data. However, how can machines of various configurations understand each other? Also, the "moving code, least moving data" policy may work better with functional programming than with imperative programming.

These questions lead to the idea of doing functional programming in JSON. If programs can be coded in JSON, they can be easily shipped around and understood by machines of various settings. Combining JSON and functional programming also makes security issues easier to track and manage.

JSON-FP is part of an attempt to make data freely and easily accessed, distributed, annotated, meshed, and even re-emerged with new values. To achieve that, it is important to be able to ship code to where the data reside, and that is what JSON-FP is trying to do.
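The idea of shipping a program as JSON can be illustrated with a toy evaluator. This is only a sketch: the expression format (`var`, `map`/`over`, `sum`, `add`, `mul`) and the `evaluate` function are invented here for illustration and are not JSON-FP's actual syntax or API.

```python
import json
import math

# Hypothetical operator table; the names are illustrative, not JSON-FP's.
OPERATORS = {
    "add": sum,
    "mul": math.prod,
}


def evaluate(expr, env):
    """Evaluate a JSON-decoded expression against data held locally in env."""
    if not isinstance(expr, dict):
        return expr                          # literals evaluate to themselves
    if "var" in expr:
        return env[expr["var"]]              # look up a named local value
    if "map" in expr:
        items = evaluate(expr["over"], env)  # the list to iterate over
        return [evaluate(expr["map"], {**env, "item": x}) for x in items]
    if "sum" in expr:
        return sum(evaluate(expr["sum"], env))
    (op, args), = expr.items()               # exactly one operator per object
    return OPERATORS[op]([evaluate(a, env) for a in args])


# The "program" is plain data, so it can be serialized and sent anywhere.
program = {"sum": {"map": {"mul": [{"var": "item"}, 2]}, "over": {"var": "readings"}}}
wire_text = json.dumps(program)

# On the machine that holds the data: decode the program and run it locally.
result = evaluate(json.loads(wire_text), {"readings": [1, 2, 3]})
print(result)
```

Because the program has no access to anything except the operators in the table and the values in `env`, it is straightforward to audit what shipped code can touch, which is the security point made above.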


A high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, largely written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.


The next generation of IPython notebooks. IPython will continue to exist as a Python kernel for Jupyter, but the notebook and other language-agnostic parts of IPython will move to new projects under the Jupyter name. IPython 3.0 will be the last monolithic release of IPython.


This repository contains custom Contents classes that allow IPython to use Google Drive for file management. The code is organized as a Python package that contains functions to install a Jupyter Notebook JavaScript extension, and to activate/deactivate different IPython profiles to be used with Google Drive.


Multi-user server for Jupyter notebooks.


Jupyter nbviewer is the web application behind The Jupyter Notebook Viewer, which is graciously hosted by Rackspace. Run this locally to get most of the features of nbviewer on your own network.


A Java virtual machine (JVM) is an abstract computing machine. There are three notions of the JVM: specification, implementation, and instance. The specification is a book that formally describes what is required of a JVM implementation. Having a single specification ensures all implementations are interoperable. A JVM implementation is a computer program that implements requirements of the JVM specification in a compliant and preferably performant manner. An instance of the JVM is a process that executes a computer program compiled into Java bytecode.


An implementation of the Python programming language which is designed to run on the Java(tm) Platform. It consists of a compiler to compile Python source code down to Java bytecodes which can run directly on a JVM, a set of support libraries which are used by the compiled Java bytecodes, and extra support to make it trivial to use Java packages from within Jython.

Jython is an implementation of the Python language for the Java platform. Jython 2.5 implements the same language as CPython 2.5, and nearly all of the Core Python standard library modules. (CPython is the C implementation of the Python language.)


An environment for scientific computation, data analysis and data visualization designed for scientists, engineers and students. The program incorporates many open-source software packages into a coherent interface using the concept of dynamic scripting.

SCaVis can be used with several scripting languages for the Java platform, such as BeanShell, Jython (the Python programming language), Groovy and JRuby (the Ruby programming language). This brings more power and simplicity to scientific computation. Programming can also be done in native Java. Finally, symbolic calculations can be done using the Matlab/Octave high-level interpreted language.


Kahler is a Python library that implements discrete exterior calculus on arbitrary Hermitian manifolds. Borrowing techniques and ideas first implemented in PyDEC, Kahler provides a uniquely general framework for computation using discrete exterior calculus. Manifolds can have arbitrary dimension, topology, bilinear Hermitian metrics, and embedding dimension. Kahler comes equipped with tools for generating triangular meshes in arbitrary dimensions with arbitrary topology. Kahler can also generate discrete sharp operators and implement de Rham maps. Computationally intensive tasks are automatically parallelized over the number of cores detected. The program itself is written in Cython, a superset of the Python language that is translated to C and compiled for extra speed. Kahler is applied to several example problems: normal modes of a vibrating membrane, electromagnetic resonance in a cavity, the quantum harmonic oscillator, and the Dirac-Kahler equation. Convergence is demonstrated on random meshes.


A simple and fast framework for spatial analysis in Python. It contains clean vector and raster data types that are coordinate system-aware, implementations of frequently used geospatial analysis methods, and read/write interfaces to several formats, including GeoJSON, shapefiles, and ESRI ASCII.


Kartograph is a simple and lightweight framework for building interactive map applications without Google Maps or any other mapping service. It was created with the needs of designers and data journalists in mind. Actually, Kartograph is two libraries. One generates beautiful & compact SVG maps; the other helps you to create interactive maps that run across all major browsers.

The Kartograph.py library is a Python library for generating beautiful, Illustrator-friendly SVG maps. The Kartograph.js library is a JavaScript library for creating interactive maps based on SVG maps.


Visualize logs and time-stamped data. Elasticsearch works seamlessly with Kibana to let you see and interact with your data.


Keyhole Markup Language (KML) is an XML notation for expressing geographic annotation and visualization within Internet-based, two-dimensional maps and three-dimensional Earth browsers. The KML file specifies a set of features (place marks, images, polygons, 3D models, textual descriptions, etc.) for display in Here Maps, Google Earth, Maps and Mobile, or any other geospatial software implementing the KML encoding. Each place always has a longitude and a latitude. Other data can make the view more specific, such as tilt, heading, altitude, which together define a "camera view" along with a timestamp or timespan. KML shares some of the same structural grammar as GML. Some KML information cannot be viewed in Google Maps or Mobile.

KML files are very often distributed in KMZ files, which are zipped KML files with a .kmz extension. These must be legacy (ZIP 2.0) compression compatible (i.e. stored or deflate method), otherwise the .kmz file might not uncompress in all geobrowsers. The contents of a KMZ file are a single root KML document (notionally "doc.kml") and optionally any overlays, images, icons, and COLLADA 3D models referenced in the KML including network-linked KML files. The root KML document by convention is a file named "doc.kml" at the root directory level, which is the file loaded upon opening. By convention the root KML document is at root level and referenced files are in subdirectories (e.g. images for overlay images).
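The packaging rules above (a root "doc.kml", referenced files in subdirectories, stored or deflate compression) can be sketched with the Python standard library. The helper names `build_kml` and `write_kmz`, the placemark, and the icon payload are hypothetical; the namespace is the OGC KML 2.2 one.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(name, lon, lat):
    """Return a minimal KML document with one placemark, as UTF-8 bytes."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinates are written longitude first, then latitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="utf-8", xml_declaration=True)


def write_kmz(target, kml_bytes, extra_files=None):
    """Write a KMZ: doc.kml at the root, other files under subdirectories."""
    # ZIP_DEFLATED keeps the archive within the legacy (ZIP 2.0) methods.
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as kmz:
        kmz.writestr("doc.kml", kml_bytes)
        for path, payload in (extra_files or {}).items():
            kmz.writestr(path, payload)


kml_bytes = build_kml("Example place", -122.08, 37.42)
buffer = io.BytesIO()  # stands in for a file such as "example.kmz"
write_kmz(buffer, kml_bytes, {"images/icon.png": b"placeholder bytes"})

with zipfile.ZipFile(buffer) as kmz:
    names = kmz.namelist()
    methods = {info.compress_type for info in kmz.infolist()}
print(names)
```

Writing "doc.kml" first keeps the root document as the first archive entry, which is the file a geobrowser loads on opening.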


Fastkml is a library to read, write and manipulate KML files. It aims to keep it simple and fast (using lxml if available). Fast refers to the time you spend writing and reading KML files as well as the time you spend getting acquainted with the library or creating KML objects. It aims to provide all of the functionality that KML clients such as OpenLayers, Google Maps, and Google Earth provide.


The KPP kinetic preprocessor is a software tool that assists the computer simulation of chemical kinetic systems. The concentrations of a chemical system evolve in time according to the differential law of mass action kinetics. A numerical simulation requires an implementation of the differential laws and a numerical integration in time.

KPP translates a specification of the chemical mechanism into Fortran77, Fortran90, C, or Matlab simulation code that implements the concentration time derivative function, its Jacobian, and its Hessian, together with a suitable numerical integration scheme. Sparsity in the Jacobian/Hessian is carefully exploited in order to obtain computational efficiency.
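To make the generated pieces concrete, here is a hand-written sketch of a derivative function and Jacobian under mass action kinetics. It is not KPP output. The single reaction A + B -> C, the rate constant `k`, and the function names are assumptions for illustration, and forward Euler stands in for the stiff integrators KPP provides.

```python
def derivative(conc, k):
    """d[A]/dt, d[B]/dt, d[C]/dt for A + B -> C under mass action kinetics."""
    a, b, _ = conc
    rate = k * a * b  # reaction rate is proportional to the reactant product
    return [-rate, -rate, rate]


def jacobian(conc, k):
    """Partial derivatives of derivative() with respect to [A], [B], [C]."""
    a, b, _ = conc
    return [
        [-k * b, -k * a, 0.0],
        [-k * b, -k * a, 0.0],
        [k * b, k * a, 0.0],
    ]


def integrate_euler(conc, k, dt, steps):
    """Forward Euler time stepping; a simple non-stiff stand-in."""
    for _ in range(steps):
        rates = derivative(conc, k)
        conc = [c + dt * r for c, r in zip(conc, rates)]
    return conc


start = [1.0, 0.5, 0.0]  # initial concentrations of A, B, C
final = integrate_euler(start, k=2.0, dt=0.001, steps=1000)
print(final)
```

Even in this tiny mechanism a third of the Jacobian entries are zero; in mechanisms with hundreds of species most entries vanish, which is the sparsity the text says KPP exploits.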

KPP incorporates a library with several widely used atmospheric chemistry mechanisms; the users can add their own chemical mechanisms to the library. KPP also includes a comprehensive suite of stiff numerical integrators. The KPP development environment is designed in a modular fashion and allows for rapid prototyping of new chemical kinetic schemes as well as new numerical integration methods.


Krita is a KDE program for sketching and painting, offering an end-to-end solution for creating digital painting files from scratch by masters. Fields of painting that Krita explicitly supports are concept art, creation of comics and textures for rendering. Modeled on existing real-world painting materials and workflows, Krita supports creative working by getting out of the way and responding snappily. There are three versions of Krita: Krita Sketch, for touch devices; Krita Desktop, for desktop systems; and finally Krita Studio, which is like Krita Desktop but supported by KO GmbH.


Free and open source video compositing software, similar in functionality to Adobe After Effects or Nuke by The Foundry. The project is a free node-based compositor that relies on OpenColorIO for color management, OpenImageIO for file format support, and Qt for the user interface. It also works with 32-bit floating-point precision per channel and supports OFX plugins, both free and commercial.


The KSTAR project supports the development of Klang, a source-to-source compiler that turns C programs with OpenMP pragmas into C programs with calls to either the StarPU or the Kaapi runtime system. The features include OpenMP 3.1, the OpenMP 4.0 depend clause, accelerator extensions, and C/C++ source-to-source translation based on Clang.


A unified runtime system for heterogeneous multicore architectures.

Software that uses StarPU to run on heterogeneous architectures includes MAGMA, SkePU and PaStiX.


A runtime for scheduling irregular fine-grained tasks with data flow dependencies. It can be used from OpenMP 4.0 compliant applications built with the GNU C or C++ compilers, the Intel compilers, or the research C/C++ source-to-source compiler KSTAR. It is a C library for executing multithreaded computations with data flow synchronization between threads. The library can schedule fine- and medium-grained programs on distributed machines. The data flow graph is dynamic (unfolded at runtime). Target architectures are clusters of SMP machines.


LabPlot is an application for interactive graphing and analysis of scientific data.


A modern open-source JavaScript library for mobile-friendly interactive maps. Leaflet is designed with simplicity, performance and usability in mind. It works efficiently across all major desktop and mobile platforms out of the box, taking advantage of HTML5 and CSS3 on modern browsers while still being accessible on older ones. It can be extended with a huge amount of plugins.


Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on a Leaflet map via Folium.

Folium makes it easy to visualize data that’s been manipulated in Python on an interactive Leaflet map. It enables both the binding of data to a map for choropleth visualizations as well as passing Vincent/Vega visualizations as markers on the map.

The library has a number of built-in tilesets from OpenStreetMap, Mapbox, and Stamen, and supports custom tilesets with Mapbox or Cloudmade API keys. Folium supports both GeoJSON and TopoJSON overlays, as well as the binding of data to those overlays to create choropleth maps with color-brewer color schemes.


libeemd is a C library for performing the ensemble empirical mode decomposition (EEMD), its complete variant (CEEMDAN) or the regular empirical mode decomposition (EMD). It includes a Python interface called pyeemd. The details of what libeemd actually computes are available as a separate article, which you should read if you are unsure about what EMD, EEMD and CEEMDAN are.


This acts as a highly efficient multi-dimensional array of arbitrary objects, but really uses a struct of arrays memory layout. It’s great for writing vectorized code and its lightning-fast iterators give you access to neighboring elements with zero address generation overhead.
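The layout idea can be sketched in a few lines: a container that is addressed like a grid of objects but keeps one flat array per field. The class and field names below are hypothetical and this is not the library's API. Python gains no vectorization from the layout, so the sketch shows only how the data is arranged.

```python
from array import array


class ParticleGrid:
    """Presents a 2D grid of particle records; stores one flat array per field."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        cells = width * height
        # Struct of arrays: every field lives in its own contiguous buffer.
        self.mass = array("d", [0.0] * cells)
        self.velocity = array("d", [0.0] * cells)

    def _index(self, x, y):
        return y * self.width + x  # row-major position in each flat array

    def set(self, x, y, mass, velocity):
        i = self._index(x, y)
        self.mass[i] = mass
        self.velocity[i] = velocity

    def get(self, x, y):
        # Reassemble an object-like view from the separate field arrays.
        i = self._index(x, y)
        return {"mass": self.mass[i], "velocity": self.velocity[i]}

    def total_momentum(self):
        # Field-wise pass: reads two contiguous arrays front to back.
        return sum(m * v for m, v in zip(self.mass, self.velocity))


grid = ParticleGrid(width=3, height=2)
grid.set(1, 1, mass=2.0, velocity=3.0)
grid.set(0, 0, mass=1.0, velocity=0.5)
print(grid.get(1, 1), grid.total_momentum())
```

In a compiled language the same arrangement lets a loop over one field load consecutive values into vector registers, which an array of whole objects would interleave with unrelated fields.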


LibGeoDecomp (Library for Geometric Decomposition codes) is an auto-parallelizing library for computer simulations. It is written in C++ and works best with kernels written in C++, but other languages (e.g. Fortran) may be linked in, too. Thanks to its modular design the library can harness all state of the art hardware architectures, e.g. multi-core CPUs, GPUs (currently only NVIDIA GPUs, via CUDA), Intel Xeon Phi, MPI clusters and Raspberry Pi.

The library takes over the spatial and temporal loops of the simulation as well as storage of the simulation data. It will call back the user code for performing the actual computations. User code in turn calls back the library to access simulation data. Thanks to this two-way callback the library can control which part of the code runs when.

Users can build custom computer simulations (e.g. engineering or natural sciences problems) by encapsulating their model in a C++ class. This class is then supplied to the library as a template parameter. The library essentially relieves the user from the pains of parallel programming, but is limited to applications which perform space- and time-discrete simulations with only local interactions.
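The two-way callback described above can be sketched as follows. This is an illustration in Python under stated assumptions, not the library's API: `simulate`, `Neighborhood`, and `AveragingCell` are hypothetical names, the grid is one-dimensional with periodic boundaries, and the real library receives the model as a C++ template parameter rather than as a list of instances.

```python
class Neighborhood:
    """Handed to user code so it can call back into the driver for data."""

    def __init__(self, grid, position):
        self._grid = grid
        self._position = position

    def __getitem__(self, offset):
        # Periodic boundary: offsets wrap around the ends of the 1D grid.
        return self._grid[(self._position + offset) % len(self._grid)]


class AveragingCell:
    """User model: a local, time-discrete update rule."""

    def __init__(self, value):
        self.value = value

    def update(self, hood):
        # Only nearest neighbours are read, so interactions stay local.
        mean = (hood[-1].value + hood[0].value + hood[1].value) / 3.0
        return AveragingCell(mean)


def simulate(cells, steps):
    """The driver owns the time loop, the space loop, and the storage."""
    for _ in range(steps):
        # Build the new grid from the old one so updates do not interfere.
        cells = [cell.update(Neighborhood(cells, i)) for i, cell in enumerate(cells)]
    return cells


initial = [AveragingCell(v) for v in [0.0, 0.0, 3.0, 0.0, 0.0]]
after_one_step = [cell.value for cell in simulate(initial, steps=1)]
print(after_one_step)
```

Because the user code never iterates over the grid itself, the driver is free to split the space loop across threads or machines without the model changing.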


A set of tools for accessing and modifying virtual machine (VM) disk images. You can use this for viewing and editing files inside guests, scripting changes to VMs, monitoring disk used/free statistics, creating guests, P2V, V2V, performing backups, cloning VMs, building VMs, formatting disks, resizing disks, and much more.

libguestfs can access almost any disk image imaginable. It can do it securely — without needing root and with multiple layers of defence against rogue disk images. It can access disk images on remote machines or on CDs/USB sticks. It can access proprietary systems like VMware and Hyper-V.

All this functionality is available through a scriptable shell called guestfish, or an interactive rescue shell virt-rescue.

libguestfs is a C library that can be linked with C and C++ management programs and has bindings for about a dozen other programming languages. Using our FUSE module you can also mount guest filesystems on the host.


A multi-platform support library with a focus on asynchronous I/O. It was primarily developed for use by Node.js, but it’s also used by Luvit, Julia, pyuv, and others.


Lighthouse is a framework for creating, maintaining, and using a taxonomy of available software that can be used to build highly-optimized matrix algebra computations. The taxonomy provides an organized anthology of software components and programming tools needed for that task. The taxonomy will serve as a guide to practitioners seeking to learn what is available for their programming tasks, how to use it, and how the various parts fit together. It builds upon and improves existing collections of numerical software, adding tools for the tuning of matrix algebra computations.


Limulus is an acronym for LInux MULti-core Unified Supercomputer. The Limulus project goal is to create and maintain an open specification and software stack for a personal workstation cluster. Ideally, a user should be able to build or purchase a small personal workstation cluster using the Limulus reference design and low cost hardware. In addition, a freely available turn-key Linux based software stack will be created and maintained for use on the Limulus design. A Limulus is intended to be a workstation cluster platform where users can develop software, test ideas, run small scale applications, and teach HPC methods.


LinBox is a C++ template library for exact, high-performance linear algebra computation with dense, sparse, and structured matrices over the integers and over finite fields. LinBox aims to provide world-class high performance implementations of the most advanced algorithms for exact linear algebra.

LinBox was originally designed primarily to work with sparse and structured matrices, which are defined in this context as those matrices where the computational cost of application of an m by n matrix to a vector is significantly less than O(mn), the cost for a dense matrix. Now, increasingly, LinBox also has codes for dense matrix computations using floating point BLAS routines for speed while not sacrificing exactness.
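The cost claim can be seen in a small sketch: when only the nonzero entries are stored, applying the matrix to a vector takes work proportional to the number of nonzeros rather than to mn. The row-dictionary storage and the function name are assumptions for illustration and are unrelated to LinBox's C++ interfaces. The arithmetic is exact because it uses integers modulo a prime.

```python
def sparse_matvec_mod(rows, vector, p):
    """Apply a sparse matrix to a vector over the integers modulo p.

    rows[i] maps a column index j to the nonzero entry at (i, j), so the
    total work is proportional to the number of stored entries.
    """
    return [sum(a * vector[j] for j, a in row.items()) % p for row in rows]


def dense_matvec_mod(matrix, vector, p):
    """Reference version that touches all m * n entries."""
    return [sum(a * x for a, x in zip(row, vector)) % p for row in matrix]


p = 7
sparse_rows = [{0: 2, 3: 5}, {}, {1: 6}]  # 3 x 4 matrix with 3 nonzeros
dense_rows = [[2, 0, 0, 5], [0, 0, 0, 0], [0, 6, 0, 0]]
vector = [1, 2, 3, 4]

product = sparse_matvec_mod(sparse_rows, vector, p)
print(product)
```

Iterative solvers of the kind the entry describes need the matrix only through products like this one, which is why they never create fill-in.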

LinBox implements iterative system-solving methods such as those of Wiedemann and Lanczos to operate on very large, sparse linear systems. This avoids the potential fill-in associated with elimination-based methods and keeps memory use relatively constant through the computation. These methods allow LinBox to be used to solve systems with hundreds of thousands of equations and hundreds of thousands of variables.


LinuxCNC (the Enhanced Machine Control) is a software system for computer control of machine tools such as milling machines and lathes. It provides:

  • several graphical user interfaces including one for touch screens

  • an interpreter for "G-code" (the RS-274 machine tool programming language)

  • a realtime motion planning system with look-ahead

  • operation of low-level machine electronics such as sensors and motor drives

  • an easy to use "breadboard" layer for quickly creating a unique configuration for your machine

  • a software PLC programmable with ladder diagrams

  • easy installation with .deb packages or a Live-CD

It does not provide drawing (CAD - Computer Aided Design) or G-code generation from the drawing (CAM - Computer Automated Manufacturing) functions.

It can simultaneously move up to 9 axes and supports a variety of interfaces. The control can operate true servos (analog or PWM) with the feedback loop closed by the LinuxCNC software at the computer, or open loop with "step-servos" or stepper motors. Motion control features include: cutter radius and length compensation, path deviation limited to a specified tolerance, lathe threading, synchronized axis motion, adaptive feedrate, operator feed override, and constant velocity control. Support for non-Cartesian motion systems is provided via custom kinematics modules. Available architectures include hexapods (Stewart platforms and similar concepts) and systems with rotary joints to provide motion such as PUMA or SCARA robots. LinuxCNC runs on Linux using real time extensions. Support currently exists for version 2.4 and 2.6 Linux kernels with real time extensions applied by RT-Linux or RTAI patches.

Linux Diminutives

Linux distributions that aren’t (necessarily) lesser but rather smaller in size.


CoreOS is a new Linux distribution that has been rearchitected to provide features needed to run modern infrastructure stacks. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience.


Welcome to OpenEmbedded, the build framework for embedded Linux. OpenEmbedded offers a best-in-class cross-compile environment. It allows developers to create a complete Linux Distribution for embedded systems. The OpenEmbedded-Core Project (OE-Core for short) resulted from the merge of the Yocto Project with OpenEmbedded.


The Yocto Project is an open source collaboration project that provides templates, tools and methods to help you create custom Linux-based systems for embedded products regardless of the hardware architecture. It was founded in 2010 as a collaboration among many hardware manufacturers, open-source operating systems vendors, and electronics companies to bring some order to the chaos of embedded Linux development.

The Yocto Project provides resources and information catering to both new and experienced users, and includes core system component recipes provided by the OpenEmbedded project. The Yocto Project also provides pointers to example code built to demonstrate its capabilities. These community-tested images include the Yocto Project kernel and cover several build profiles across multiple architectures including ARM, PPC, MIPS, x86, and x86-64. Specific platform support takes the form of Board Support Package (BSP) layers for which a standard format has been developed. The project also provides an Eclipse IDE plug-in and a graphical user interface to the build system called Hob.


Replicant is a fully free Android distribution running on several devices, a free software mobile operating system putting the emphasis on freedom and privacy/security. It aims to replace all proprietary Android components with their free software counterparts. This also makes it a security focused operating system as it closes discovered Android backdoors.[4] It is available for several smartphones and tablet computers.


An operating system based on the Linux kernel and the GNU C Library implementing the Linux API. It targets a very wide range of devices including smartphones, tablets, in-vehicle infotainment (IVI) devices, smart TVs, PCs, smart cameras, wearable computing (such as smartwatches), Blu-ray players, printers and smart home appliances[3] (such as refrigerators, lighting, washing machines, air conditioners, ovens/microwaves and a robotic vacuum cleaner[4]). Its purpose is to offer a consistent user experience across devices. Tizen is a project within the Linux Foundation and is governed by a Technical Steering Group (TSG) composed of Samsung and Intel among others.

HTML5 applications run on Tizen, Android, Firefox OS, Ubuntu Touch, Windows Phone, and webOS without a browser. Applications based on Qt, GTK+ and EFL frameworks can run on Tizen IVI.[28] While there is no official support for these third-party frameworks, according to the explanation on the Tizen SDK Web site,[29] Tizen applications for mobile devices can be developed without relying on an official Tizen IDE as long as the application complies with Tizen packaging rules. In May 2013, a community port of Qt to Tizen focused on delivering native GUI controls and integration of Qt with Tizen OS features for smartphones.[30] Based on the Qt port to Tizen, Tizen and Mer can interchange code.


WebOS, also known as webOS, LG webOS, Open webOS, or HP webOS, is a Linux kernel-based multitasking operating system for smart devices like TVs[1] and smartwatches,[2] and was formerly a mobile operating system.[3] It was initially developed by Palm, which was acquired by Hewlett-Packard; HP then made the platform open source, and it became Open webOS.

The WebOS mobile platform introduced features so innovative that some are still in use by Apple, Microsoft and Google on their mobile operating systems iOS, Windows Phone, and Android, respectively.

Linux Distributions

More for unusual and/or cute ones than those that are more well known.

Linux From Scratch

Linux From Scratch (LFS) is a project that provides you with step-by-step instructions for building your own custom Linux system, entirely from source code.


The LOCKSS Program, based at Stanford University Libraries, provides libraries and publishers with award-winning, low-cost, open source digital preservation tools to preserve and provide access to persistent and authoritative digital content.

We recommend installing the LOCKSS software on a dedicated server or a virtual machine. We provide the LOCKSS software integrated with a Linux installation based on CentOS 6.


A Linux distribution with a unique approach to package and configuration management. Built on top of the Nix package manager, it is completely declarative, makes upgrading systems reliable, and has many other advantages.


Openwall GNU/*/Linux (or Owl for short) is a small security-enhanced Linux distribution for servers, appliances, and virtual appliances. Owl live CDs with remote SSH access are also good for recovering or installing systems (whether with Owl or not). Another secondary use is for operating systems and/or computer security courses, which benefit from the simple structure of Owl and from our inclusion of the complete build environment.

Linux Initialization

See below.


A highly-available key value store for shared configuration and service discovery. A component of CoreOS.


This ties together systemd and etcd into a distributed init system. Think of it as an extension of systemd that operates at the cluster level instead of the machine level. This project is very low level and is designed as a foundation for higher order orchestration.


In Unix-based computer operating systems, init (short for initialization) is the first process started during booting of the computer system. Init is a daemon process that continues running until the system is shut down. It is the direct or indirect ancestor of all other processes and automatically adopts all orphaned processes. Init is started by the kernel using a hard-coded filename; a kernel panic will occur if the kernel is unable to start it. Init is typically assigned process identifier 1.

The design of init in Unix systems such as System III and System V has diverged from the functionality provided by the init in Research Unix and its BSD derivatives. The usage on most Linux distributions is somewhat compatible with System V, but some distributions, such as Slackware, use a BSD-style init, and others, such as Gentoo, have their own customized version.

Several replacement init implementations have been written in an attempt to address design limitations in the standard versions. These include launchd, the Service Management Facility, systemd and Upstart.


A suite of basic building blocks for a Linux system. It provides a system and service manager that runs as PID 1 and starts the rest of the system. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic. systemd supports SysV and LSB init scripts and works as a replacement for sysvinit. Other parts include a logging daemon, utilities to control basic system configuration like the hostname, date, locale, maintain a list of logged-in users and running containers and virtual machines, system accounts, runtime directories and settings, and daemons to manage simple network configuration, network time synchronization, log forwarding, and name resolution.


Upstart is an event-based replacement for the /sbin/init daemon which handles starting of tasks and services during boot, stopping them during shutdown and supervising them while the system is running. It was originally developed for the Ubuntu distribution, but is intended to be suitable for deployment in all Linux distributions as a replacement for the venerable System-V init.

Linux Package Managers


The software at the base of the package management system in the free operating system Debian and its numerous derivatives. dpkg is used to install, remove, and provide information about .deb packages.

dpkg itself is a low level tool; higher level tools, such as APT, are used to fetch packages from remote locations or deal with complex package relations. Tools like aptitude or synaptic are more commonly used than dpkg on its own, as they have a more sophisticated way of dealing with package relationships and a friendlier interface.


Aptitude is an ncurses-based front end to APT, the Debian package manager. Since it is text based, it is run from a terminal or a CLI (command line interface).

dpkg on Fedora

An RPM file containing dpkg for Fedora distributions.


Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. It provides atomic upgrades and rollbacks, side-by-side installation of multiple versions of a package, multi-user package management and easy setup of build environments.


The Nix Packages collection (Nixpkgs) is a set of nearly 6,500 packages for the Nix package manager, released under a permissive MIT/X11 license. On GNU/Linux, the packages in Nixpkgs are ‘pure’, meaning that they have no dependencies on packages outside of the Nix store. This means that they should work on pretty much any GNU/Linux distribution.


OpenPKG provides a flexible and extensive toolkit of about 1500 portable and high-quality Unix server software packages within a fully self-contained packaging framework. OpenPKG 4 supports all major Unix server platforms, including BSD, GNU/Linux, Solaris and MacOS X flavors, and can be deployed multiple times on a single system without virtualization technologies and with minimum intrusion. The OpenPKG software distribution is updated daily and hence always provides you with the latest Open Source server software.


The RPM Package Manager (RPM) is a powerful command line driven package management system capable of installing, uninstalling, verifying, querying, and updating computer software packages. Each software package consists of an archive of files along with information about the package like its version, a description, and the like. There is also a library API, permitting advanced developers to manage such transactions from programming languages such as C or Python.



Antik provides a foundation for scientific and engineering computation in Common Lisp. It is designed not only to facilitate numerical computations, but to permit the use of numerical computation libraries and the interchange of data and procedures, whether foreign (non-lisp) or Lisp libraries. It is named after the Antikythera mechanism, one of the oldest examples of a scientific computer known.


In a nutshell, Hy is a Lisp dialect, but one that converts its structure into Python … literally a conversion into Python’s abstract syntax tree! (Or to put it in more crude terms, Hy is lisp-stick on a Python!)

This is pretty cool because it means Hy is several things:

  • A Lisp that feels very Pythonic

  • For Lispers, a great way to use Lisp’s crazy powers but in the wide world of Python’s libraries (why yes, you now can write a Django application in Lisp!)

  • For Pythonistas, a great way to start exploring Lisp, from the comfort of Python!

  • For everyone: a pleasant language that has a lot of neat ideas!
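To make the phrase "conversion into Python’s abstract syntax tree" concrete, here is a minimal sketch in plain Python (not Hy) that turns a tiny Lisp-like form into an AST with the standard library's `ast` module, then compiles and evaluates it. The helper name `lisp_add_to_ast` is hypothetical and handles only addition of numbers; it illustrates the idea and makes no claim about how Hy itself is implemented.

```python
import ast

def lisp_add_to_ast(form):
    """Turn a Lisp-like form such as ("+", 1, 2) into a Python AST.

    Hypothetical helper for illustration; it supports only "+" on numbers.
    """
    op, *args = form
    if op != "+" or len(args) < 2:
        raise ValueError("this sketch only supports (+ a b ...)")
    # Fold (+ a b c) into the nested expression ((a + b) + c).
    node = ast.Constant(value=args[0])
    for arg in args[1:]:
        node = ast.BinOp(left=node, op=ast.Add(), right=ast.Constant(value=arg))
    tree = ast.Expression(body=node)
    # compile() needs line/column info on every node.
    return ast.fix_missing_locations(tree)

tree = lisp_add_to_ast(("+", 1, 2, 3))
code = compile(tree, filename="<lisp>", mode="eval")
result = eval(code)
print(result)
```

Because the output is an ordinary Python AST, the resulting code object runs on the normal Python interpreter and can call any Python library.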


Quicklisp makes it easy to get started with a rich set of community-developed Common Lisp libraries. Quicklisp is a library manager for Common Lisp. It works with your existing Common Lisp implementation to download, install, and load any of over 1,100 libraries with a few simple commands. Quicklisp is easy to install and works with ABCL, Allegro CL, Clozure CL, CLISP, CMUCL, ECL, LispWorks, SBCL, and Scieneer CL, on Linux.


Livingstone2 is a reusable artificial intelligence (AI) software system designed to assist spacecraft, life support systems, chemical plants or other complex systems in operating robustly with minimal human supervision, even in the face of hardware failures or unexpected events. Livingstone2 diagnoses the current state of the spacecraft or other system and recommends commands or repair actions that will allow the system to continue operations.

Livingstone2 is an enhancement and re-engineering of the Livingstone diagnosis system that was flight tested on-board the Deep Space One spacecraft in May 1999. It contains significant enhancements to robustness, performance and usability. Livingstone2 is able to track multiple diagnostic hypotheses, as opposed to a single hypothesis in Livingstone. It is also able to revise diagnostic decisions made in the past when additional observations become available. In such cases, Livingstone might find the incorrect hypothesis. These improvements increase robustness.

Re-architecting and re-implementing the system in C++ has increased performance. Usability has been vastly improved by creating a set of development tools which are closely integrated with the Livingstone2 engine. In addition to the core diagnosis engine, Livingstone2 now includes a compiler that translates diagnostic models written in a Java-like language into Livingstone2’s language, and a broad set of graphical tools for model development. These software tools support the rapid deployment of model-based representations of complex systems for Livingstone2 via a visual model builder/tester (Stanley), and two graphical user interface tools (Candidate Manager and History Table) which provide Livingstone2 status information during testing. Runtime support is provided by the real-time interface (RTI) which converts analog sensor readings to the digital values required by Livingstone2.

Also included in the Livingstone2 download is Oliver, a prototype model builder/tester. It is incomplete, but could be used as a starting point for a new model builder/tester.


The LLVM compiler infrastructure project (formerly Low Level Virtual Machine) is a compiler infrastructure designed as a set of reusable libraries with well-defined interfaces. It is written in C++ and is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Originally implemented for C and C++, the language-agnostic design (and the success) of LLVM has since spawned a wide variety of front ends: languages with compilers that use LLVM include Common Lisp, ActionScript, Ada, D, Fortran, OpenGL Shading Language, Go, Haskell, Java bytecode, Julia, Objective-C, Swift, Python, Ruby, Rust, Scala, C# and Lua.


This project aims to fully build the Linux kernel using Clang which is the C front end for the LLVM compiler infrastructure project. Together Clang and LLVM have many positive attributes and features which many developers and system integrators would like to take advantage of when developing and deploying the Linux Kernel as a part of their own projects.


Pure is a modern-style functional programming language based on term rewriting. It offers equational definitions with pattern matching, full symbolic rewriting capabilities, dynamic typing, eager and lazy evaluation, lexical closures, built-in list and matrix support and an easy-to-use C interface. The interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native code.


Implementation of the LLVM tutorial in Python.

Loopy lets you easily generate the tedious, complicated code that is necessary to get good performance out of GPUs and multi-core CPUs.

Loopy’s core idea is that a computation should be described simply and then transformed into a version that gets high performance. This transformation takes place under user control, from within Python.


LROSE is an NSF-backed project to develop common software for the LIDAR, RADAR and PROFILER community.


The Larval TRANSport Lagrangian model (LTRANS v.2b) is an off-line particle-tracking model that runs with the stored predictions of a 3D hydrodynamic model, specifically the Regional Ocean Modeling System (ROMS). Although LTRANS was built to simulate oyster larvae, it can easily be adapted to simulate passive particles and other planktonic organisms. LTRANS v.2 is written in Fortran 90 and is designed to track the trajectories of particles in three dimensions. It includes a 4th order Runge-Kutta scheme for particle advection and a random displacement model for vertical turbulent particle motion. Reflective boundary conditions, larval behavior, and settlement routines are also included.
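As a rough illustration of 4th order Runge-Kutta particle advection, the sketch below moves a single particle through a steady, analytic 2D velocity field (solid-body rotation). The field and the names `velocity` and `rk4_step` are assumptions made for this example; LTRANS itself is written in Fortran 90 and works in three dimensions with stored ROMS predictions.

```python
import math

def velocity(x, y):
    """Assumed analytic velocity field: solid-body rotation about the origin."""
    return -y, x

def rk4_step(x, y, dt):
    """Advance one particle by dt with the classical 4th-order Runge-Kutta scheme."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = velocity(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new

# Track a particle through one full rotation (period 2*pi) in 1000 steps.
x, y = 1.0, 0.0
n_steps = 1000
dt = 2.0 * math.pi / n_steps
for _ in range(n_steps):
    x, y = rk4_step(x, y, dt)
print(x, y)
```

After one full period the particle should return very close to its starting point, which is a convenient check on the integrator. A random displacement term for turbulence is deliberately left out of this sketch.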


LuxMark is an OpenCL benchmark tool. The idea for the program was conceived in 2009 by Jean-Francois "Jromang" Romang. It was intended as a promotional tool for LuxRender (to quote Jromang’s original words: "LuxRender propaganda with OpenCL"). The idea was quite simple: wrap SLG inside an easy to use graphical user interface and use it as a benchmark for OpenCL.


The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the server cluster is fully transparent to end users, and the users interact as if it were a single high-performance virtual server.

The real servers and the load balancers may be interconnected by either high-speed LAN or by geographically dispersed WAN. The load balancers can dispatch requests to the different servers and make parallel services of the cluster appear as a virtual service on a single IP address, and request dispatching can use IP load balancing technologies or application-level load balancing technologies. Scalability of the system is achieved by transparently adding or removing nodes in the cluster. High availability is provided by detecting node or daemon failures and reconfiguring the system appropriately.
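The dispatching idea can be sketched as a small user-space model: one virtual service hands requests to a pool of real servers in turn, and a failed node is dropped from the pool. Round-robin is used here only as a simple stand-in for a scheduling policy, and the class `RoundRobinDispatcher` is hypothetical; it does not represent the Linux Virtual Server implementation or its interfaces.

```python
class RoundRobinDispatcher:
    """Hypothetical dispatcher: one virtual service backed by several real servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index of the server that receives the next request

    def add(self, server):
        """Add a node to the pool (scaling out)."""
        self.servers.append(server)

    def remove(self, server):
        """Drop a failed node so no further requests are sent to it."""
        idx = self.servers.index(server)
        self.servers.pop(idx)
        # Keep the cursor pointing at the same upcoming server after removal.
        if idx < self._next:
            self._next -= 1
        self._next = self._next % len(self.servers) if self.servers else 0

    def dispatch(self, request):
        """Return the (server, request) pair chosen for this request."""
        if not self.servers:
            raise RuntimeError("no real servers available")
        server = self.servers[self._next]
        self._next = (self._next + 1) % len(self.servers)
        return server, request

lb = RoundRobinDispatcher(["rs1", "rs2", "rs3"])
first = [lb.dispatch(i)[0] for i in range(4)]
lb.remove("rs2")  # simulate a detected node failure
after = [lb.dispatch(i)[0] for i in range(3)]
print(first, after)
```

Clients only ever talk to the dispatcher, so adding or removing nodes stays invisible to them.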


Matrix algebra on GPU and multicore architectures. The MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current Multicore+GPU systems.


clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU optimized BLAS and LAPACK for AMD hardware, can be found in the AMD clMath Libraries (formerly APPML).


Mal is a Clojure-inspired Lisp interpreter.

Mal is implemented in 26 different languages.

Mal is a learning tool. Each implementation of mal is separated into 11 incremental, self-contained (and testable) steps that demonstrate core concepts of Lisp. The last step is capable of self-hosting (running the mal implementation of mal).


Mantevo is a multi-faceted application performance project. It provides application performance proxies known as miniapps. Miniapps combine some or all of the dominant numerical kernels contained in an actual stand-alone application. Miniapps include libraries wrapped in a test driver providing representative inputs. They may also be hard-coded to solve a particular test case so as to simplify the need for parsing input files and mesh descriptions. Mini apps range in scale from partial, performance-coupled components of the application to a simplified representation of a complete execution path through the application.


TeaLeaf is a mini-app that solves the linear heat conduction equation on a spatially decomposed regular grid using a 5 point stencil with implicit solvers. TeaLeaf currently solves the equations in two dimensions, but three dimensional support is in beta.

The solvers have been written in Fortran with OpenMP and MPI and they have also been ported to OpenCL to provide an accelerated capability. Other versions invoke third party linear solvers and currently include PETSc, Trilinos and Hypre, which are in beta release. For each of these versions there are instructions on how to download, build and link in the relevant library.
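For readers unfamiliar with the numerical method, the following is a minimal pure-Python sketch of one implicit (backward Euler) time step of 2D heat conduction on a regular grid with a 5 point stencil. Jacobi iteration is used as the simplest possible implicit solver, and the function name, fixed boundary values and parameter `r` are assumptions of this example rather than anything taken from the TeaLeaf source.

```python
def implicit_heat_step(u, r, iterations=200):
    """One backward-Euler step of u_t = k * laplacian(u) on a regular 2D grid.

    r = k * dt / h**2. For each interior cell this solves
        (1 + 4r) * new[i][j] - r * (sum of the 4 neighbours of new) = u[i][j]
    by Jacobi iteration. Boundary cells are held fixed (assumed Dirichlet).
    """
    ny, nx = len(u), len(u[0])
    new = [row[:] for row in u]
    for _ in range(iterations):
        nxt = [row[:] for row in new]
        for i in range(1, ny - 1):
            for j in range(1, nx - 1):
                neighbours = (new[i - 1][j] + new[i + 1][j]
                              + new[i][j - 1] + new[i][j + 1])
                nxt[i][j] = (u[i][j] + r * neighbours) / (1.0 + 4.0 * r)
        new = nxt
    return new

# A hot spot in the middle of a cold 7x7 plate.
n = 7
grid = [[0.0] * n for _ in range(n)]
grid[3][3] = 100.0
stepped = implicit_heat_step(grid, r=0.25)
print(round(stepped[3][3], 3), round(stepped[2][3], 3))
```

The implicit system is diagonally dominant, so the Jacobi sweeps converge for any positive `r`; production codes would normally use a faster Krylov-type solver.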

Mapbox Studio

Mapbox Studio gives you instant streaming access to massive global datasets like Mapbox Streets, Mapbox Terrain, and Mapbox Satellite without importing any data onto your computer.

Create your own vector tiles using Mapbox Studio. Convert data from traditional formats (Shapefile, GeoJSON, KML, GPX) and upload directly to Mapbox to deploy your vector tiles at scale.


A flexible and complete framework for building rich web-mapping applications. It emphasizes high productivity and high-quality development. MapFish is based on the Pylons Python web framework. MapFish extends Pylons with geospatial-specific functionality. For example, MapFish provides specific tools for creating web services that allow querying and editing geographic objects.

MapFish also provides a complete RIA-oriented JavaScript toolbox, a JavaScript testing environment, and tools for compressing JavaScript code. The JavaScript toolbox is composed of the ExtJS, OpenLayers, and GeoExt JavaScript toolkits.

MapFish is compliant with the Open Geospatial Consortium standards. This is achieved through OpenLayers or GeoExt supporting several OGC norms, like WMS, WFS, WMC, KML, GML etc.

MapFish Print

MapFish Print allows printing maps as PDFs. It is written in Java and typically executed as a servlet in a servlet container such as Apache Tomcat.


Mapnik is a Free Toolkit for developing mapping applications. It’s written in C++ and there are Python bindings to facilitate fast-paced agile development. It can comfortably be used for both desktop and web development, which was something I wanted from the beginning.

Mapnik is about making beautiful maps. It uses the AGG library and offers world class anti-aliasing rendering with subpixel accuracy for geographic data. It is written from scratch in modern C++ and doesn’t suffer from design decisions made a decade ago. When it comes to handling common software tasks such as memory management, filesystem access, regular expressions, parsing and so on, Mapnik doesn’t re-invent the wheel, but utilizes best of breed industry standard libraries from

Mapnik uses a plugin architecture to read different datasources. Current plugins can read ESRI shapefiles, PostGIS, TIFF raster, OSM xml, Kismet, as well as all OGR/GDAL formats.

See OGCServer.


An Open Source platform for publishing spatial data and interactive mapping applications to the web. MapServer is an Open Source geographic data rendering engine written in C. Beyond browsing GIS data, MapServer allows you to create “geographic image maps”, that is, maps that can direct users to content.


EOxServer is a Python application and framework for presenting Earth Observation (EO) data and metadata. It implements the OGC Implementation Specifications EO-WCS and EO-WMS on top of MapServer’s WCS and WMS implementations.

EOxServer is open source software for registering, processing, and publishing Earth Observation (EO) data via different Web Services. EOxServer is written in Python and relies on widely-used libraries for geospatial data manipulation.

The core concept of the EOxServer data model is the one of a coverage. In this context, a coverage is a mapping from a domain set (a geographic region of the Earth described by its coordinates) to a range set. For original EO data, the range set usually consists of measurements of some physical quantity (e.g. radiation for optical instruments).


MathFu is a C++ math library developed primarily for games focused on simplicity and efficiency.

It provides a suite of vector, matrix and quaternion classes to perform basic geometry suitable for game developers. This functionality can be used to construct geometry for graphics libraries like OpenGL or perform calculations for animation or physics systems.

Matlab/Octave Translation


A package for converting Matlab or Octave to C++.

Matlab Toolboxes


Bayesian Compressive Sensing (BCS) is a Bayesian framework for solving the inverse problem of compressive sensing (CS). The basic BCS algorithm adopts the relevance vector machine (RVM), and later it is extended by marginalizing the noise variance with improved robustness. This is a MatLab 7.0 implementation of BCS, VB-BCS (BCS implemented via a variational Bayesian (VB) approach), TS-BCS for wavelet and for block-DCT implemented via both MCMC approach and VB approach.


The Curvelet transform is a higher dimensional generalization of the Wavelet transform designed to represent images at different scales and different angles. Curvelets enjoy two unique mathematical properties, namely:

  • Curved singularities can be well approximated with very few coefficients and in a non-adaptive manner - hence the name "curvelets."

  • Curvelets remain coherent waveforms under the action of the wave equation in a smooth medium.

By releasing the CurveLab toolbox, we hope to encourage the dissemination of curvelets to image processing, inverse problems and scientific computing.


Maven is a build automation tool used primarily for Java projects. The word maven means accumulator of knowledge in Yiddish. Maven addresses two aspects of building software: First, it describes how software is built, and second, it describes its dependencies. Contrary to preceding tools like Apache Ant, it uses conventions for the build procedure, and only exceptions need to be written down. An XML file describes the software project being built, its dependencies on other external modules and components, the build order, directories, and required plug-ins. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging. Maven dynamically downloads Java libraries and Maven plug-ins from one or more repositories such as the Maven 2 Central Repository, and stores them in a local cache. This local cache of downloaded artifacts can also be updated with artifacts created by local projects. Public repositories can also be updated.


MBDyn is the first and possibly the only free* general purpose Multibody Dynamics analysis software. It features the integrated multidisciplinary simulation of multibody, multiphysics systems, including nonlinear mechanics of rigid and flexible bodies (geometrically exact & composite-ready beam and shell finite elements, component mode synthesis elements, lumped elements) subjected to kinematic constraints, along with smart materials, electric networks, active control, hydraulic networks, and essential fixed-wing and rotorcraft aerodynamics.

MBDyn simulates the behavior of heterogeneous mechanical, aeroservoelastic systems based on first principles equations. It can be easily coupled to external solvers for co-simulation of multiphysics problems, e.g. Computational Fluid Dynamics (CFD), terradynamics, block-diagram solvers like Scicos, Scicoslab and Simulink, using a simple C, C++ or Python peer-side API.

MBDyn is being actively developed and used in the aerospace (aircraft, helicopters, tiltrotors, spacecraft), wind energy (wind turbines), automotive (cars, trucks) and mechatronic fields (industrial robots, parallel robots, micro aerial vehicles (MAV)) for the analysis and simulation of the dynamics of complex systems.


Morse decompositions for piecewise constant vector fields.


MediaGoblin is a free software media publishing platform that anyone can run. You can think of it as a decentralized alternative to Flickr, YouTube, SoundCloud, etc.


MediaWiki is a free software open source wiki package written in PHP, originally for use on Wikipedia. It is now also used by several other projects of the non-profit Wikimedia Foundation and by many other wikis, including this website, the home of MediaWiki.

MediaWiki is an extremely powerful, scalable software and a feature-rich wiki implementation that uses PHP to process and display data stored in a database, such as MySQL. Pages use MediaWiki’s wikitext format, so that users without knowledge of XHTML or CSS can edit them easily. When a user submits an edit to a page, MediaWiki writes it to the database, but without deleting the previous versions of the page, thus allowing easy reverts in case of vandalism or spamming. MediaWiki can manage image and multimedia files, too, which are stored in the filesystem. For large wikis with lots of users, MediaWiki supports caching and can be easily coupled with Squid proxy server software.


The Collection extension for MediaWiki allows users to collect articles and generate downloadable versions in different formats (PDF, OpenDocument Text etc.) for article collections and single articles.


Provides a library for parsing MediaWiki articles and converting them to different output formats. The Collection extension is a MediaWiki extension enabling users to collect articles and generate PDF files from them.


Kiwix is an offline reader for web content. It’s software intended to make Wikipedia available without using the internet, but it is potentially suitable for all HTML content. Kiwix supports the ZIM format, a highly compressed open format with additional meta-data. The features include a full text search engine, bookmarks and notes, an HTTP server, PDF/HTML export, a user interface in more than 10 languages, tabbed navigation, an integrated content manager and downloader, etc.

See also OpenZIM.


A new innovative Python implementation harnessing Google’s super fast Dart Virtual Machine running Python at near native speeds.


A global optimization software tool that integrates two prominent population-based stochastic algorithms, namely Particle Swarm Optimization and Differential Evolution, with well established efficient local search procedures made available via the Merlin optimization environment. The resulting hybrid algorithms, also referred to as Memetic Algorithms, combine the space exploration advantage of their global part with the efficiency asset of the local search, and as expected they have displayed a highly efficient behavior in solving diverse optimization problems. The proposed software is carefully parametrized so as to offer complete control to fully exploit the algorithmic virtues. It is accompanied by comprehensive examples and a large set of widely used test functions, including tough atomic cluster and protein conformation problems.
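The hybrid (memetic) idea can be sketched in a few lines: a population-based global search proposes candidates, and a local search polishes the best one each generation. The example below pairs a basic Differential Evolution loop with a simple coordinate pattern search, which stands in for the Merlin local optimizers mentioned above. All function names, parameter values and the test objective are assumptions of this sketch, not part of the described software.

```python
import random

def sphere(x):
    """Assumed test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def local_search(f, x, step=0.1, shrink=0.5, tol=1e-6):
    """Coordinate pattern search (a stand-in for a real local optimizer)."""
    x, fx = list(x), f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= shrink  # no better neighbour: refine the step size
    return x, fx

def memetic_de(f, dim, bounds, pop_size=20, generations=50, F=0.7, CR=0.9, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # DE/rand/1/bin: mutate three distinct other members, then cross over.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)
            trial = [
                pop[a][k] + F * (pop[b][k] - pop[c][k])
                if (rng.random() < CR or k == j_rand) else pop[i][k]
                for k in range(dim)
            ]
            trial = [min(hi, max(lo, v)) for v in trial]
            ft = f(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
        # Memetic step: refine the current best individual locally.
        best = min(range(pop_size), key=fit.__getitem__)
        pop[best], fit[best] = local_search(f, pop[best])
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

best_x, best_f = memetic_de(sphere, dim=3, bounds=(-5.0, 5.0))
print(best_f)
```

The global part keeps exploring the search space while the local part supplies fast convergence near a promising point, which is the trade-off the description refers to.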


Mercurium is a source-to-source compilation infrastructure aimed at fast prototyping. Currently supported languages are C and C++. Mercurium is mainly used in the Nanos environment to implement OpenMP, but since it is quite extensible it has been used to implement other programming models or compiler transformations. Examples include Cell Superscalar, Software Transactional Memory, Distributed Shared Memory and the ACOTES project, to name a few.

Extending Mercurium is achieved using a plugin architecture, where plugins represent several phases of the compiler. These plugins are written in C++ and dynamically loaded by the compiler according to the chosen configuration. Code transformations are implemented in terms of source code (there is no need to modify or know the internal syntactic representation of the compiler).


DLB is a dynamic library designed to speed up hybrid applications by improving their load balance. DLB will redistribute the computational resources of the second level of parallelism to improve the load balance of the outer level of parallelism.


OmpSs is an effort to integrate features from the StarSs programming model developed by BSC into a single programming model. In particular, our objective is to extend OpenMP with new directives to support asynchronous parallelism and heterogeneity (devices like GPUs). However, it can also be understood as new directives extending other accelerator based APIs like CUDA or OpenCL. Our OmpSs environment is built on top of our Mercurium compiler and Nanos++ runtime system.


Nanos++ is a runtime designed to serve as runtime support in parallel environments. It is mainly used to support OmpSs, an extension to OpenMP developed at BSC. It also has modules to support OpenMP and Chapel.

Nanos++ provides services to support task parallelism using synchronizations based on data-dependencies. Data parallelism is also supported by means of services mapped on top of its task support. Tasks are implemented as user-level threads when possible (currently x86, x86-64, ia64, ppc32 and ppc64 are supported).

Nanos++ also provides support for maintaining coherence across different address spaces (such as with GPUs or cluster nodes). It provides software directory and cache modules to this end.
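Synchronization based on data dependencies can be illustrated with a small scheduling sketch: each task declares the data it reads and writes, and a task becomes ready once the tasks that produce its inputs have finished. The functions `schedule_waves` and `run` and the task tuples are hypothetical and written in Python for illustration only; they are not the Nanos++ API, and only read-after-write dependencies are tracked.

```python
def schedule_waves(tasks):
    """Group tasks into waves that respect read-after-write data dependencies.

    tasks: list of (name, inputs, outputs, func) in program order. Each task
    depends on the latest earlier task that writes one of its inputs. Other
    hazards (write-after-read, write-after-write) are ignored in this sketch.
    """
    last_writer = {}
    deps = []
    for idx, (_name, inputs, outputs, _func) in enumerate(tasks):
        deps.append({last_writer[d] for d in inputs if d in last_writer})
        for d in outputs:
            last_writer[d] = idx
    done, waves = set(), []
    while len(done) < len(tasks):
        # Tasks whose producers have all finished are ready; the tasks in one
        # wave are mutually independent and could run concurrently.
        ready = [i for i in range(len(tasks)) if i not in done and deps[i] <= done]
        waves.append(ready)
        done.update(ready)
    return waves

def run(tasks):
    """Execute the tasks wave by wave (serially here, for simplicity)."""
    data, order = {}, []
    for wave in schedule_waves(tasks):
        for i in wave:
            name, _inputs, _outputs, func = tasks[i]
            func(data)
            order.append(name)
    return data, order

tasks = [
    ("A", [], ["a"], lambda d: d.update(a=1)),
    ("B", [], ["b"], lambda d: d.update(b=2)),
    ("C", ["a", "b"], ["c"], lambda d: d.update(c=d["a"] + d["b"])),
    ("D", ["a"], ["d"], lambda d: d.update(d=d["a"] * 10)),
]
waves = schedule_waves(tasks)
data, order = run(tasks)
print(waves, data)
```

Here tasks A and B have no inputs and form the first wave, while C and D wait only for the data they actually read.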


MeteoIO can be seen as a set of modules that is focused on the handling of input/output operations (including data preparation) for numerical simulations in the realm of earth sciences. On the visible side, it offers the following modules, working on a pre-determined set of meteorological parameters or on parameters added by the developer:

  • a set of plugins for accessing the data (for example, a plugin might be responsible for fetching the raw data from a given database)

  • a set of filters and processing elements for applying transformations to the data (for example, a filter might remove all data that is out of range)

  • a set of resampling algorithms to temporally interpolate the data at the required timestamp

  • a set of parametrizations to generate data/meteorological parameters when they could not be interpolated

  • a set of spatial interpolation algorithms (for example, such an algorithm might perform Inverse Distance Weighting for filling a grid with spatially interpolated data)

Each of these steps can be configured and fine tuned according to the needs of the model and the wishes of the user.
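As an illustration of the spatial interpolation step, here is a minimal Inverse Distance Weighting sketch that fills a small grid from point measurements. The function names, the `(x, y, value)` station tuples and the default power of 2 are assumptions made for this example and do not reflect MeteoIO's actual interfaces.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    stations: non-empty list of (sx, sy, value) tuples (hypothetical layout).
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a station: use its measurement
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def fill_grid(stations, nx, ny, cell=1.0):
    """Return ny rows of nx interpolated values on a regular grid."""
    return [[idw(stations, i * cell, j * cell) for i in range(nx)]
            for j in range(ny)]

# Two stations at the ends of a 5-cell transect.
stations = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0)]
grid = fill_grid(stations, nx=5, ny=1)
print(grid[0])
```

Cells that coincide with a station reproduce its value, and cells in between receive a distance-weighted blend of the surrounding measurements.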


Meteor is an ultra-simple environment for building modern websites. What once took weeks, even with the best tools, now takes hours with Meteor.

The web was originally designed to work in the same way that mainframes worked in the 70s. The application server rendered a screen and sent it over the network to a dumb terminal. Whenever the user did anything, that server rerendered a whole new screen. This model served the Web well for over a decade. It gave rise to LAMP, Rails, Django, PHP.

But the best teams, with the biggest budgets and the longest schedules, now build applications in JavaScript that run on the client. These apps have stellar interfaces. They don’t reload pages. They are reactive: changes from any client immediately appear on everyone’s screen.

They’ve built them the hard way. Meteor makes it an order of magnitude simpler, and a lot more fun. You can build a complete application in a weekend, or a sufficiently caffeinated hackathon. No longer do you need to provision server resources, or deploy API endpoints in the cloud, or manage a database, or wrangle an ORM layer, or swap back and forth between JavaScript and Ruby, or broadcast data invalidations to clients.


The simulation and parameter optimization of coupled ocean circulation and ecosystem models in three space dimensions is one of the most challenging tasks in numerical climate research. Here we present a scientific toolkit that aims at supporting researchers by defining clear coupling interfaces, providing state-of-the-art numerical methods for simulation, parallelization and optimization while using only freely available and (to a great extent) platform-independent software. Besides defining a user-friendly coupling interface (API) for marine ecosystem or biogeochemical models, we heavily rely on the Portable, Extensible Toolkit for Scientific computation (PETSc) developed at Argonne Nat. Lab. for a wide variety of parallel linear and non-linear solvers and optimizers. We specifically focus on the usage of matrix-free Newton-Krylov methods for the fast computation of steady periodic solutions, and make use of the Transport Matrix Method (TMM).


A Lisp implemented in < 1 KB of JavaScript with macros, TCO, interop and exception handling.


A parallelized Python library for finding modal decompositions and reduced-order models. Parallel implementations of the proper orthogonal decomposition (POD), balanced POD (BPOD), dynamic mode decomposition (DMD), and Petrov-Galerkin projection are provided, as well as serial implementations of the Observer Kalman filter Identification method (OKID) and the Eigensystem Realization Algorithm (ERA). Modred is applicable to a wide range of problems and nearly any type of data.


A next generation web framework for the Perl programming language.


Mondrian is a general purpose statistical data-visualization system. It features outstanding interactive visualization techniques for data of almost any kind, and has particular strengths, compared to other tools, for working with Categorical Data, Geographical Data and LARGE Data.

All plots in Mondrian are fully linked, and offer many interactions and queries. Any case selected in a plot in Mondrian is highlighted in all other plots.

Currently implemented plots comprise Histograms, Boxplots y by x, Scatterplots, Barcharts, Mosaicplots, Missing Value Plots, Parallel Coordinates/Boxplots, SPLOMs and Maps.

Mondrian works with data in standard tab-delimited or comma-separated ASCII files and can load data from R workspaces. There is basic support for working directly on data in Databases (please email for further info).

Mondrian is written in Java and is distributed as a native application (wrapper) for MacOS X and Windows. Linux users need to start the jar file.


MongoDB (from humongous) is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.


D4 is an automated tool for generating distributed document database designs for applications running on MongoDB. This tool specifically targets applications running highly concurrent workloads, and thus its designs are tailored to the unique properties of large-scale, Web-based applications. It can also be used to assist in porting MySQL-based applications to MongoDB.

Using a sample workload trace from either a document-oriented or relational database application, D4 will compute the best database design that optimizes the throughput and latency of a document DBMS.


Moodle is a free, online Learning Management system enabling educators to create their own private website filled with dynamic courses that extend learning, any time, anywhere. Whether you’re a teacher, student or administrator, Moodle can meet your needs. Moodle’s extremely customisable core comes with many standard features.


Mopidy is an extensible music server written in Python.

Mopidy plays music from local disk, Spotify, SoundCloud, Google Play Music, and more. You edit the playlist from any phone, tablet, or computer using a range of MPD and web clients.


MORSE is a generic simulator for academic robotics. It focuses on realistic 3D simulation of small to large environments, indoor or outdoor, with one to tens of autonomous robots.

MORSE can be entirely controlled from the command-line. Simulation scenes are generated from simple Python scripts.

MORSE comes with a set of standard sensors (cameras, laser scanner, GPS, odometry, …), actuators (speed controllers, high-level waypoints controllers, generic joint controllers) and robotic bases (quadrotors, ATRV, Pioneer3DX, generic 4 wheel vehicle, PR2, …). New ones can easily be added.

MORSE rendering is based on the Blender Game Engine. The OpenGL-based Game Engine supports shaders, provides advanced lighting options, supports multi-texturing, and uses the state-of-the-art Bullet library for physics simulation.


MoviePy is a Python module for video editing, which can be used for basic operations (like cuts, concatenations, title insertions), video compositing (a.k.a. non-linear editing), video processing, or to create advanced effects. It can read and write the most common video formats, including GIF.


Comparison among OOP versions of an MPDATA code written using Python, Fortran and C++.




MPICH-Madeleine is a free MPICH-based implementation of the MPI standard, which is a high-level communication interface designed to provide high performance communications on various network architectures including supercomputers and clusters of workstations (usually off-the-shelf PCs interconnected by high speed links). Clusters of workstations are becoming increasingly popular thanks to the availability of many high speed connection technologies (Gigabit-Ethernet, Myrinet, GigaNet, SCI). Furthermore, interconnecting such COWs to build heterogeneous clusters of clusters is now a hot issue. Unfortunately, no current MPI implementation supports this kind of architecture efficiently. Indeed, the only way to handle network heterogeneity is to use interoperable implementations of MPI: several MPI implementations (one per cluster) communicate with each other using an inter-MPI glue.

Our alternate proposal is to provide a true multi-protocol implementation of MPI on top of a generic and multi-protocol communication layer called Madeleine (version 3). Madeleine III is the communication sub-system of the Parallel Multithreaded Machine runtime environment.

This project is deprecated in favor of NewMadeleine.


NewMadeleine is the fourth incarnation of the Madeleine communication library. The new architecture aims at enabling the use of a much wider range of communication flow optimization techniques. Its design is entirely modular: drivers and optimization strategies are dynamically loadable software components, allowing experimentations with multiple approaches or on multiple issues with regard to processing communication flows.

The optimizing scheduler SchedOpt targets applications with irregular, multi-flow communication schemes, such as those found in the increasingly common application conglomerates made of multiple programming environments and coupled pieces of code. SchedOpt itself is easily extensible through the concepts of optimization strategies (what to optimize for, what the optimization goal is) expressed in terms of tactics (how to optimize to reach the optimization goal). Tactics themselves are made of basic communication flow operations such as packet merging or reordering.

The communication library is fully multi-threaded through its close integration with PIOMan. It manages concurrent communication operations from multiple libraries and from multiple threads. Its MPI implementation Mad-MPI fully supports the MPI_THREAD_MULTIPLE multi-threading level. It is available on Infiniband (ibverbs), Myrinet (MX and GM), TCP (sockets) and legacy SCI and Quadrics QsNet-2.

Open MPI

The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. Open MPI offers advantages for system and software vendors, application developers and computer science researchers.


The Open Resilient Cluster Manager (ORCM) was originally developed as an open-source project (under the Open MPI license) by Cisco Systems, Inc to provide a resilient, 100% uptime run-time environment for enterprise-class routers. Based on the Open Run-Time Environment (ORTE) embedded in Open MPI, the system provided launch and execution support for processes executing within the router itself (e.g., computing routing tables), ensuring that a minimum number of copies of each program were always present.

ORCM (Open Resilient Cluster Manager) is a derivative of the Open MPI implementation. It consists of:

  • ORCM: Open Resilient Cluster Manager. Provides the following: Resource Management, Scheduler, Job launcher and Resource Monitoring subsystem.

  • ORTE: The Open Run-Time Environment (support for different back-end run-time systems). Provides the RM messaging interface, RM error management subsystem, RM routing subsystem and RM resource allocation subsystem.

  • OPAL: The Open Portable Access Layer (utility and "glue" code used by ORCM and ORTE). Provides operating system interfaces.


The mpld3 project brings together Matplotlib, the popular Python-based graphing library, and D3js, the popular Javascript library for creating interactive data visualizations for the web. The result is a simple API for exporting your matplotlib graphics to HTML code which can be used within the browser, within standard web pages, blogs, or tools such as the IPython notebook.


In a first course on classical mechanics, elementary physical processes like elastic two-body collisions, the mass–spring model, or the gravitational two-body problem are discussed in detail. The continuation to many-body systems, however, is deferred to graduate courses, although the underlying equations of motion are essentially the same and although there is strong motivation, for high-school students in particular, because of the use of particle systems in computer games. The missing link between the simple and the more complex problem is a basic introduction to solving the equations of motion numerically, which could be illustrated by means of the Euler method. The many-particle physics simulation package MPPhys offers a platform to experiment with simple particle simulations. The aim is to give a basic idea of how to implement many-particle simulations and how simulation and visualization can be combined for interactive visual explorations.
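As a rough illustration of the Euler method mentioned above, here is a minimal Python sketch for a single mass on a spring. It is not taken from MPPhys; the function names and parameters are made up for this example.

```python
def euler_mass_spring(x0, v0, k, m, dt, steps):
    """Integrate m * x'' = -k * x with the explicit (forward) Euler method.

    Returns a list of (time, position, velocity) tuples.
    All names here are illustrative and not part of MPPhys.
    """
    x, v = x0, v0
    trajectory = [(0.0, x, v)]
    for n in range(1, steps + 1):
        a = -k / m * x                   # acceleration from Hooke's law
        x, v = x + dt * v, v + dt * a    # both updates use the old state
        trajectory.append((n * dt, x, v))
    return trajectory


def energy(x, v, k, m):
    """Total energy (kinetic plus elastic) of the oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x
```

Comparing `energy` at the start and end of a run shows the well-known drawback of the explicit Euler scheme: the energy of the oscillator slowly grows, which is one reason more careful integrators are usually preferred for long simulations.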


MPWide is a light-weight communication library for distributed computing. It is specifically developed to allow message passing over long-distance networks using path-specific optimizations.


Morse-Smale Complex Extraction, Exploration, and Reasoning is a set of tools and libraries for feature extraction and exploration in scalar fields. MSCEER computes a gradient-based abstract representation of a scalar field.


The mscomplex3d package consists of two modules for computation and analysis of Morse-Smale complexes on 3D grids. The first is a command-line executable named mscomplex3d. The second is a Python loadable module named pyms3d. The Morse-Smale complex is a topological data structure that partitions datasets based on the gradients of an input scalar function. See here for a quick introduction on Morse-Smale complexes. This website presents software that computes the Morse-Smale complex of scalar functions defined on 3D Structured Grids and 2D triangle meshes.


MTT comprises a set of tools for modelling dynamic physical systems using the bond-graph methodology and transforming these models into representations suitable for analysis, control and simulation.


The Multiscale Coupling Library and Environment is a portable framework to do multiscale modeling and simulation on distributed computing resources. The generic coupling mechanism of MUSCLE is suitable for many types of multiscale applications, notably for multiscale models as defined by the MAPPER project or complex automata as defined in the COAST project. Submodels can be implemented from scratch, but legacy code can also be used with only minor adjustments. The runtime environment solves common problems in distributed computing and couples submodels of a multiscale model, whether they are built for high-performance supercomputers or for local execution. MUSCLE supports Java, C, C++, Fortran, Python, MATLAB and Scala code, using MPI, OpenMP, or threads.


Copies between local file systems are a daily activity. Files are constantly being moved to locations accessible by systems with different functions and/or storage limits, being backed up and restored, or being moved due to upgraded and/or replaced hardware. Hence, maximizing the performance of copies as well as checksums that ensure the integrity of copies is desirable to minimize the turnaround time of user and administrator activities. Modern parallel file systems provide very high performance for such operations using a variety of techniques such as striping files across multiple disks to increase aggregate I/O bandwidth and spreading disks across multiple servers to increase aggregate interconnect bandwidth.

To achieve peak performance from such systems, it is typically necessary to utilize multiple concurrent readers/writers from multiple systems to overcome various single-system limitations such as number of processors and network bandwidth. The standard cp and md5sum tools of GNU coreutils found on every modern Unix/Linux system, however, utilize a single execution thread on a single CPU core of a single system, hence cannot take full advantage of the increased performance of clustered file systems.

Mutil provides mcp and msum, which are drop-in replacements for cp and md5sum that utilize multiple types of parallelism to achieve maximum copy and checksum performance on clustered file systems. Multi-threading is used to ensure that nodes are kept as busy as possible. Read/write parallelism allows individual operations of a single copy to be overlapped using asynchronous I/O. Multi-node cooperation allows different nodes to take part in the same copy/checksum. Split file processing allows multiple threads to operate concurrently on the same file. Finally, hash trees allow inherently serial checksums to be performed in parallel.
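The hash-tree idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the format msum produces: the chunk size, the pairwise tree layout and the default SHA-256 algorithm are assumptions chosen for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def tree_checksum(data, chunk_size=1 << 20, workers=4, algorithm="sha256"):
    """Checksum `data` with a hash tree so that leaf digests can be computed in parallel.

    Illustrative sketch only; the result differs from a plain digest of the
    whole input whenever there is more than one chunk.
    """
    def digest(block):
        return hashlib.new(algorithm, block).digest()

    # Split the input into independent chunks (at least one, even if empty).
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]

    # Leaves: each chunk is hashed independently, so this step parallelizes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(digest, chunks))

    # Inner nodes: hash pairs of child digests until a single root remains.
    while len(level) > 1:
        level = [digest(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Because each leaf depends only on its own chunk, different threads (or, in a multi-node setting, different machines) can work on different parts of the same file, and only the small digests need to be combined at the end.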


The Muster library provides implementations of serial and parallel K-Medoids clustering algorithms. It is intended as a general framework for parallel cluster analysis, particularly for performance data analysis on systems with very large numbers of processes.

The parallel implementations in Muster are designed to perform well even in environments where the data to be clustered is entirely distributed. For example, many performance tools need to analyze one data element from each process in a system. To analyze this data efficiently, clustering algorithms that move as little data as possible are required. In Muster, we exploit sampled clustering algorithms to realize this efficiency.

The parallel algorithms in Muster are implemented using the Message Passing Interface (MPI), making them suitable for use on many of the world’s largest supercomputers. They should, however, also run efficiently on your laptop.
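For readers unfamiliar with K-Medoids, the following is a plain serial sketch in Python. It is not Muster's API and does not include the sampling or MPI parts described above; the naive initialisation (the first k points) and the function names are assumptions made for the example.

```python
import math


def assign(points, medoids):
    """Label every point with the index of its nearest medoid."""
    return [min(medoids, key=lambda m: math.dist(p, points[m])) for p in points]


def k_medoids(points, k, max_iter=100):
    """Alternate between assignment and medoid update until medoids stop changing."""
    medoids = list(range(k))  # naive initialisation: the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for m in medoids:
            # Members of this cluster (fall back to the medoid itself if empty).
            members = [i for i, lab in enumerate(labels) if lab == m] or [m]
            # The new medoid is the member with the smallest total distance to the rest.
            best = min(members,
                       key=lambda c: sum(math.dist(points[c], points[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)
```

Unlike K-Means, the cluster centres are always actual data points, which is convenient when the "points" are measurements for which an average has no clear meaning.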


Parallel wavelet compression.

National Data Service

The National Data Service is an emerging vision of how scientists and researchers across all disciplines can find, reuse, and publish data. It is an international federation of data providers, data aggregators, community-specific federations, publishers, and cyberinfrastructure providers. It builds on the data archiving and sharing efforts under way within specific communities and links them together with a common set of tools.


Navigation and estimation tools written in Python.


Parallel computation is widely employed in scientific research, engineering activities and product development. Writing a parallel program is not always a simple task, depending on the problem being solved. Large-scale scientific computing, huge data analyses and precise visualizations, for example, require parallel computation, and parallel computing in turn needs parallelization techniques. In this chapter, support for parallel program generation is discussed, and a computer-assisted parallel program generation system, P-NCAS, is introduced. Computer-assisted problem solving is one of the key methods to promote innovation in science and engineering, and contributes to enriching our society and our life by moving toward a programming-free environment in computing science. Research on problem solving environments (PSEs) started in the 1970s to enhance programming power. P-NCAS is one of these PSEs. The PSE concept provides an integrated, human-friendly computational software and hardware system to solve a target class of problems.


NcSOS adds an OGC SOS service to datasets in your existing THREDDS server. It complies with the IOOS SWE Milestone 1.0 templates and requires your datasets be in any of the CF 1.6 Discrete Sampling Geometries.

NcSOS acts like other THREDDS services (such as OPeNDAP and WMS) in that there are individual service endpoints for each dataset. It is best to aggregate your files and enable the NcSOS service on top of the aggregation. For example, the NcML aggregate of hourly files from an individual station would be a good candidate to serve with NcSOS. Serving the individual hourly files with NcSOS would not be as beneficial.

You will need a working THREDDS installation of at least version 4.3.16 to run NcSOS.


The numerical differentiation library (NDL) is used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes.
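To make the idea of finite differencing concrete, here is a small generic Python sketch using central differences. It does not reproduce NDL's interface or its choice of formulas and step sizes; the function names and default steps are assumptions for illustration.

```python
def partial(f, x, i, h=1e-5):
    """Central-difference estimate of df/dx_i at the point x (a sequence of floats)."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)


def second_partial(f, x, i, j, h=1e-4):
    """Central-difference estimate of d2f/(dx_i dx_j) at the point x."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di * h
        y[j] += dj * h
        return f(y)

    # Four-point stencil; for i == j it reduces to the usual second
    # difference with an effective step of 2 * h.
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
```

Each estimate needs only a handful of independent evaluations of `f`, which is why this kind of computation lends itself to the task-based parallelization described above: every function evaluation can be scheduled as a separate task.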


Neko is a high-level, dynamically typed programming language. It can be used as an embedded scripting language. It has been designed to provide a common runtime for several different languages. Learning and using Neko is very easy. You can easily extend the language with C libraries. You can also write generators from your own language to Neko and then use the Neko Runtime to compile, run, and access existing libraries.


Neo is a minimal and fast Go web framework with an extremely simple API.

During development, Neo automatically recompiles and reruns your application when the source changes.

Build your Neo application in a few lines of code.


An open-source NoSQL graph database implemented in Java and Scala. With development starting in 2003, it has been publicly available since 2007. The source code and issue tracking are available on GitHub, with support readily available on Stack Overflow and the Neo4j Google group.

Neo4j implements the Property Graph Model efficiently down to the storage level. As opposed to graph processing or in-memory libraries, Neo4j provides full database characteristics including ACID transaction compliance, cluster support, and runtime failover, making it suitable to use graph data in production scenarios.

Neo4j’s free and open-source Community edition is a high-performance, fully ACID-transactional database. The Community edition includes (but is not limited to) all the functionality described previously in this section.


GraphGists are an easy way to create and share documents containing not just prose, structure and pictures but most importantly example graph models and use-cases expressed in Neo4j’s query language Cypher. These documents are written in AsciiDoc — the simple, textual markup language — and rendered in your browser as rich and interactive web pages that you can quickly evolve from describing simple howtos or questions to providing an extensive use-case specification.


The NESToolbox is a collection of algorithms to perform similarity estimation for irregularly sampled time series as they arise for example in the geosciences. It is implemented as a toolbox for the widely used software MATLAB and the freely available open-source software OCTAVE.

The installation of the Python port is simple: just copy the file into your working directory.




A simple Fortran 90 interface to NetCDF reading and writing.

With over 120,000 modules in Node’s package repository, it is easy to extend the range of palette nodes to add new capabilities.


Running IPython notebook servers on Amazon’s EC2.


We introduce the very first NPU compilation workflow, called NPiler, which automatically converts annotated regions of imperative code to a neural network representation. First, the programmer annotates the regions of imperative code to be transformed to a neural representation. NPiler accepts inputs from the programmer to train the network. During this step, NPiler automatically observes the input and output pairs of the annotated regions to collect training and testing data. Then, NPiler trains each possible NPU topology given constraints provided by the programmer. The outcome of this exploration is the best possible NPU topology in terms of minimum root mean square error (RMSE) on test data. Finally, our compiler replaces the annotated regions with the final neural network representation. We use the FANN library to execute the neural network representation. We released NPiler with seven representative benchmarks from diverse domains to evaluate our NPU compilation workflow.


Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.


Oasis is an open source finite element Navier-Stokes solver written from scratch in Python using building blocks from FEniCS. The solver is unstructured, runs with MPI and interfaces, through FEniCS, to the state-of-the-art linear algebra backend PETSc. Oasis advocates a high-level, programmable Python user interface, where the user is placed in complete control of every aspect of the solver.

There are currently two solvers implemented, one for steady-state and one for transient flows. The transient solver uses the fractional step algorithm for any finite element discretization of the Navier-Stokes equations. The steady-state solver is coupled using a mixed space for velocity and pressure.


OCaml is the main implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. OCaml extends the core Caml language with object-oriented constructs.

OCaml’s toolset includes an interactive top-level interpreter, a bytecode compiler, and an optimizing native code compiler. It has a large standard library that makes it useful for many of the same applications as Python or Perl, as well as robust modular and object-oriented programming constructs that make it applicable for large-scale software engineering. OCaml is the successor to Caml Light. The acronym CAML originally stood for Categorical Abstract Machine Language, although OCaml abandons this abstract machine.

OCaml is a free open source project managed and principally maintained by INRIA. In recent years, many new languages have drawn elements from OCaml, most notably F# and Scala.


The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results using finite difference, spectral element and discontinuous Galerkin methods show OCCA delivers portable high performance in different architectures and platforms.

High-order finite-difference methods are commonly used in wave propagators for industrial subsurface imaging algorithms. Computational aspects of the reduced linear elastic vertical transversely isotropic propagator are considered. Thread parallel algorithms suitable for implementing this propagator on multi-core and many-core processing devices are introduced. Portability is addressed through the use of the OCCA runtime programming interface. Finally, performance results are shown for various architectures on a representative synthetic test case.


The Open Cloud Computing Interface comprises a set of open community-led specifications delivered through the Open Grid Forum. OCCI is a protocol and API for all kinds of management tasks. OCCI was originally initiated to create a remote management API for IaaS model based services, allowing for the development of interoperable tools for common tasks including deployment, autonomic scaling and monitoring. It has since evolved into a flexible API with a strong focus on integration, portability, interoperability and innovation while still offering a high degree of extensibility. The current release of the Open Cloud Computing Interface is suitable to serve many other models in addition to IaaS, including PaaS and SaaS.


OpenCL implementations are provided as ICDs (Installable Client Drivers). An OpenCL program can use several ICDs thanks to an ICD Loader such as the one provided by this project. This free ICD Loader can load any (free or non-free) ICD.

This package aims at creating an Open Source alternative to vendor specific OpenCL ICD loaders. The main difficulty in creating such software is that the order of function pointers in a structure is not publicly available. This software maintains a YAML database of all known and guessed entries. This package also delivers a skeleton of bindings to incorporate inside an OpenCL implementation to give it ICD functionalities.


The Open Community Runtime project is creating an application building framework that explores new methods of high-core-count programming. The initial focus is on HPC applications. Its goal is to create a tool that helps app developers improve the power efficiency, programmability, and reliability of their work while maintaining app performance.

OCR will help the app developer with the complex process of writing multi-core apps by masking the effort to manage event-driven tasks, events (which embody dataflow and code flow dependencies), memory data blocks (with semantic annotations for runtime use), machine description facilities, and more.

This is a large open source project distributed under the GPL-2.0+ open source license. With a mature and established codebase containing almost 8 million lines of code, Linux ACPI is written largely in C. OCR was originally unveiled at Supercomputing Conference 2012 (SC12) with a major new release (v0.8) introduced at Supercomputing 2013 (SC13). Community participation is encouraged, both for runtime enhancement as well as exploration of algorithm/application decomposition for new programming models.


The octavemagic extension provides the ability to interact with Octave. It is provided by the oct2py package, which may be installed using pip or easy_install.


OData (Open Data Protocol) is an OASIS standard that defines the best practice for building and consuming RESTful APIs. OData helps you focus on your business logic while building RESTful APIs without having to worry about the approaches to define request and response headers, status codes, HTTP methods, URL conventions, media types, payload formats and query options etc. OData also guides you about tracking changes, defining functions/actions for reusable procedures and sending asynchronous/batch requests etc. Additionally, OData provides facility for extension to fulfil any custom needs of your RESTful APIs. OData RESTful APIs are easy to consume. The OData metadata, a machine-readable description of the data model of the APIs, enables the creation of powerful generic client proxies and tools. Some of them can help you interact with OData even without knowing anything about the protocol.
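As a small illustration of the URL conventions and query options mentioned above, the following Python sketch builds a query URL with the standard `$filter`, `$top` and `$select` system query options. The service root, the `Products` entity set and its `Name` and `Price` properties are hypothetical, and the helper function is not part of any OData library.

```python
from urllib.parse import quote, urlencode


def odata_query_url(service_root, entity_set, **options):
    """Build an OData query URL.

    Options are given without the leading '$', e.g. filter="Price gt 20".
    This helper is a hypothetical convenience written for this example.
    """
    params = {f"${name}": str(value) for name, value in options.items()}
    # Keep '$' and ',' readable; encode spaces as %20 rather than '+'.
    query = urlencode(params, safe="$,", quote_via=quote)
    url = f"{service_root.rstrip('/')}/{entity_set}"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    # Hypothetical service: five products costing more than 20, two fields each.
    print(odata_query_url("https://example.org/odata/", "Products",
                          filter="Price gt 20", top=5, select="Name,Price"))
```

Because the conventions are uniform across services, the same helper would work unchanged against any entity set; only the names passed in would differ.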


ODataPy is an open-source Python library that implements the Open Data Protocol (OData). It supports the OData protocol version 4.0. It is built on top of ODataCpp using language binding. It is under development and currently serves only parts of client and client side proxy generation (code gen) aspects of OData.


OData server with support for MySQL and for BLOBs managed by Leveled.


The OpenFabrics Enterprise Distribution (OFED™) is open-source software for RDMA and kernel bypass applications. OFED is used in business, research and scientific environments that require highly efficient networks, storage connectivity and parallel computing. The software provides high performance computing sites and enterprise data centers with flexibility and investment protection as computing evolves towards applications that require extreme speeds, massive scalability and utility-class reliability.

OFED includes kernel-level drivers, channel-oriented RDMA and send/receive operations, kernel bypasses of the operating system, both kernel and user-level application programming interface (API) and services for parallel message passing (MPI), sockets data exchange (e.g., RDS, SDP), NAS and SAN storage (e.g. iSER, NFS-RDMA, SRP) and file system/database systems.

The network and fabric technologies that provide RDMA performance with OFED include: legacy 10 Gigabit Ethernet, iWARP for Ethernet, RDMA over Converged Ethernet (RoCE), and 10/20/40 Gigabit InfiniBand.


OFF, an open source (free software) code for performing fluid dynamics simulations, is presented. The aim of OFF is to solve, numerically, the unsteady (and steady) compressible Navier–Stokes equations of fluid dynamics by means of finite volume techniques: the research background is mainly focused on high-order (WENO) schemes for multi-fluid, multi-phase flows over complex geometries. To this purpose a highly modular, object-oriented application program interface (API) has been developed. In particular, the concepts of data encapsulation and inheritance available within the Fortran language (from the 2003 standard) have been stressed in order to represent each fluid dynamics “entity” (e.g. the conservative variables of a finite volume, its geometry, etc.) by a single object so that a large variety of computational libraries can be easily (and efficiently) developed upon these objects.


See also Leaflet and MapFish.


A reference implementation of the OGC Sensor Observation Service specification (version 2.0).


The CEDA OGC Web Services framework (COWS) is a Python software framework developed at the Centre for Environmental Data Archival for implementing Open Geospatial Consortium web service standards.


GeoJModelBuilder couples geoprocessing Web services, NASA World Wind and Sensor Web services to support geoprocessing modeling and environmental monitoring. The main goal of GeoJModelBuilder is to bring an easy-to-use tool to the geoscientific community.

The tool can allow users to drag and drop various geospatial services to visually generate workflows and interact with the workflows in a virtual globe environment. It also allows users to audit trails of workflow executions, check the provenance of data products, and support scientific reproducibility.

The tool is developed in Java because of its platform independence. It can be run on any operating system that supports Java, such as Windows or Unix/Linux.


A catalog application to manage spatially referenced resources. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. GeoNetwork has been developed to connect spatial information communities and their data using a modern architecture, which is at the same time powerful and low cost, based on the principles of Free and Open Source Software (FOSS) and International and Open Standards for services and protocols (among others, from ISO/TC211 and OGC).

The software provides an easy to use web interface to search geospatial data across multiple catalogs, combine distributed map services in the embedded map viewer, publish geospatial data using the online metadata editing tools and optionally the embedded GeoServer map server.

You will find support for a range of standards: metadata standards (ISO19115/ISO19119/ISO19110 following ISO19139, FGDC and Dublin Core), catalog interfaces (OGC-CSW2.0.2 ISO profile client and server, OAI-PMH client and server, GeoRSS server, GEO OpenSearch server, WebDAV harvesting, GeoNetwork to GeoNetwork harvesting support) and map service interfaces (OGC-WMS, WFS, WCS, KML and others) through the embedded GeoServer map server.


GeoNode is a web-based application and platform for developing geospatial information systems (GIS) and for deploying spatial data infrastructures (SDI). Data management tools built into GeoNode allow for integrated creation of data, metadata, and map visualizations. Each dataset in the system can be shared publicly or restricted to allow access to only specific users. Social features like user profiles and commenting and rating systems allow for the development of communities around each platform to facilitate the use, management, and quality control of the data the GeoNode instance contains.


GeoPackage is the modern alternative to formats like SDTS and Shapefile. At its core, GeoPackage is simply a SQLite database schema. If you know SQLite, you are close to knowing GeoPackage. Install SpatiaLite, the premier spatial extension to SQLite, and you get all the performance of a spatial database along with the convenience of a file-based data set that can be emailed, shared on a USB drive or burned to a DVD.

GeoPackage was carefully designed this way to facilitate widespread adoption and use of a single simple file format by both commercial and open-source software applications — on enterprise production platforms as well as mobile hand-held devices. GeoPackage is a standard from the Open Geospatial Consortium. It was designed and prototyped following a multi-year, open process of requirements testing and public input. It is designed for extension. So if you need more than the core GeoPackage feature set, join OGC’s open process to standardize community-tested enhancements.
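Since a GeoPackage is an SQLite database, its list of layers can be read with nothing more than Python's built-in `sqlite3` module. The sketch below uses an in-memory database with a deliberately simplified stand-in for the standard `gpkg_contents` table (a real GeoPackage has more columns and additional required tables); the `rivers` layer is a made-up example.

```python
import sqlite3


def list_geopackage_contents(conn):
    """Return (table_name, data_type, srs_id) for every layer listed in gpkg_contents."""
    rows = conn.execute(
        "SELECT table_name, data_type, srs_id FROM gpkg_contents ORDER BY table_name"
    )
    return rows.fetchall()


# Simplified stand-in for a real file; with an actual GeoPackage one would
# call sqlite3.connect() on the path of the .gpkg file instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE gpkg_contents (
           table_name TEXT NOT NULL PRIMARY KEY,
           data_type  TEXT NOT NULL,
           identifier TEXT UNIQUE,
           srs_id     INTEGER
       )"""
)
# Hypothetical layer entry used only for this example.
conn.execute("INSERT INTO gpkg_contents VALUES ('rivers', 'features', 'Rivers', 4326)")
```

The point of the example is the one made in the text: no special driver is needed to inspect the file, because ordinary SQL against ordinary SQLite tables is enough.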


GeoWebCache is a Java web application used to cache map tiles coming from a variety of sources such as OGC Web Map Service (WMS). It implements various service interfaces (such as WMS-C, WMTS, TMS, Google Maps KML, Virtual Earth) in order to accelerate and optimize map image delivery. It can also recombine tiles to work with regular WMS clients.


An OGC SOS server implementation written in Python. istSOS allows for managing and dispatching observations from monitoring sensors according to the Sensor Observation Service standard. The project also provides a graphical user interface that eases daily operations and a RESTful web API for automating administration procedures.


Compliant WMS server written in Python and using the Mapnik C++ library.


OWSLib is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models.


An OGC CSW server implementation written in Python that allows for the publishing and discovery of geospatial metadata, providing a standards-based metadata and catalogue component of spatial data infrastructures.


Python library for collecting Met/Ocean observations. Pyoos attempts to fill the need for a high-level data collection library for met/ocean data publicly available through many different websites and web services.

Pyoos will collect and parse the following data services into the Paegan Discrete Geometry CDM:

  • IOOS SWE SOS 1.0 Services

    • ex. NcSOS instance:

    • ex. IOOS 52N instance:

  • NERRS Observations - SOAP

  • NDBC Observations - SOS

  • CO-OPS Observations - SOS

  • STORET Water Quality - WqxOutbound via REST

  • USGS NWIS Water Quality - WqxOutbound via REST

  • USGS Instantaneous Values - WaterML via REST

  • NWS AWC Observations - XML via REST

  • HADS (limited to a 7-day rolling window of data)


PySOS is a Python-based implementation of the OGC SOS standard. It is a lightweight set of scripts that work in conjunction with a web server to serve data from a relational database.


The OGC Styled Layer Descriptor (SLD) profile of the WMS standard defines encoding that extends the WMS standard to allow user-defined symbolization and coloring of geographic feature and coverage data. It addresses the need for users and software to be able to control the visual portrayal of the geospatial data. The ability to define styling rules requires a styling language that the client and server can both understand.

This is a Python library for reading, writing, and manipulating SLD files.


An implementation of the Web Processing Service standard from the Open Geospatial Consortium. PyWPS offers an environment for programming your own processes (geofunctions or models) which can be accessed by the public. The main advantage of PyWPS is that it has been written with native support for GRASS GIS. Access to GRASS modules via a web interface should be as easy as possible.


A client for Sensor Observation Services (SOS) as specified by the Open Geospatial Consortium (OGC). It allows users to retrieve metadata from SOS web services and to interactively create requests for near real-time observation data based on the available sensors, phenomena, observations et cetera using thematic, temporal and spatial filtering.


SOS.js is a JavaScript library to browse, visualise, and access data from an Open Geospatial Consortium (OGC) Sensor Observation Service (SOS). The library consists of a number of modules, which along with their dependencies build a layered abstraction for communicating with a SOS.

The core module - SOS.js, contains a number of objects that encapsulate core concepts of SOS, such as managing the service connection parameters, the service’s capabilities document, methods to access the service’s Features of Interest (FOIs), offerings, observed properties etc. It also contains various utility functions, available as methods of the SOS.Utils object. This module is built on top of OpenLayers, for low-level SOS request/response handling.

The user interface module - SOS.Ui.js, contains the UI components of the library. These components can be used standalone, but are also brought together in the default SOS.App object as a (somewhat) generic web application. This module is built on top of OpenLayers which provides simple mapping for discovery; jQuery for the UI and plumbing; and flot, which is a jQuery plugin, for the plotting.


Sensor Observation Service (SOS) and data management.


A library extending the basic SQLite core in order to get a full fledged Spatial DBMS, really simple and lightweight, but mostly OGC-SFS compliant.


A collection of open source Command Line Interface (CLI) tools supporting SpatiaLite.


The web-based Earth Observation Monitor (webEOM) provides easy access and visualization for spatial time-series data. It is based on a spatial data infrastructure containing a Metadata catalogue, visualization and download services as well as processing services. These services are compliant to specifications of the Open Geospatial Consortium (OGC).

webEOM is designed for ease of use. Time-series plots can be generated within a few clicks, with no data processing required from the user. Further developments are planned for 2014: for example, users will be able to generate further plots specified by individual parameters and to specify monitoring parameters for individual areas and datasets.


Data download from multiple data providers, as well as data integration and provision with OGC-compliant web services, are implemented in the Python programming language.


The OmicABEL (pronounced as "amicable") package allows rapid mixed-model based genome-wide association analysis; it efficiently handles large datasets, and both single trait and multiple trait ("omics") analysis.

CLAK-GWAS is a software for performing Genome-Wide Association Studies (GWAS). It provides a high-performance implementation of two algorithms, CLAK-Chol and CLAK-Eig, for GWAS involving single and multiple phenotypes, respectively.


Omni Compiler is a collection of programs and libraries that allow users to build code transformation compilers. It translates C and Fortran programs with XcalableMP and/or OpenACC directives into parallel code suitable for compiling with a native compiler linked with the Omni Compiler runtime library.

Omni Compiler consists of the following components:

  • XcalableMP - XcalableMP is a directive-based language extension of C and Fortran for distributed memory systems. XcalableMP allows users to develop a parallel application and to tune its performance with minimal and simple notation.

  • OpenACC - OpenACC is a directive-based programming interface for accelerators such as GPGPU. OpenACC allows users to express the offloading of data and computations to accelerators to simplify the porting process for legacy CPU-based applications.

  • XcodeML - XcodeML is an intermediate code written in XML for C and Fortran languages.


OMP2HMPP is a tool that automatically translates high-level C source code annotated with OpenMP into HMPP. The generated version will rarely differ from a hand-coded HMPP version, and will provide a significant speedup, near 113%, that could later be improved by hand-coded CUDA.


OMP2MPI automatically generates MPI source code from OpenMP, allowing the program to exploit non-shared-memory architectures such as clusters or Network-on-Chip-based (NoC-based) Multiprocessor Systems-on-Chip (MPSoC). OMP2MPI gives a solution that allows further optimization by an expert who wants to achieve better results. On the tested set of problems it obtains, in most cases, a speedup of more than 20x on 64 cores compared to the sequential version, and an average speedup of over 4x compared to OpenMP.


OmpSs is an effort to integrate features from the StarSs programming model developed by BSC into a single programming model. In particular, our objective is to extend OpenMP with new directives to support asynchronous parallelism and heterogeneity (devices like GPUs). However, it can also be understood as new directives extending other accelerator based APIs like CUDA or OpenCL. Our OmpSs environment is built on top of our Mercurium compiler and Nanos++ runtime system.

Asynchronous parallelism is enabled in OmpSs by the use of data dependencies between the different tasks of the program. To support heterogeneity, a new construct is introduced: the target construct.
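The idea of deriving task ordering from data dependencies can be sketched in plain Python. This is only an analogy: OmpSs itself works through compiler directives on C, C++ and Fortran code, and the `run_tasks` helper and its task tuple layout are invented for this illustration. Each task names the data it reads and the data it writes, and a task runs as soon as the tasks producing its inputs have finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(tasks, initial, workers=4):
    """Run tasks asynchronously, ordered only by their data dependencies.

    tasks:   list of (func, input_names, output_name) in program order
    initial: dict of data available before any task runs
    """
    last_writer = {}  # data name -> future of the task that produces it
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for func, inputs, output in tasks:
            deps = [last_writer.get(name) for name in inputs]

            def body(func=func, inputs=inputs, deps=deps):
                # Wait for each producer task, or fall back to the initial data.
                args = [dep.result() if dep is not None else initial[name]
                        for name, dep in zip(inputs, deps)]
                return func(*args)

            last_writer[output] = pool.submit(body)
        return {name: fut.result() for name, fut in last_writer.items()}


tasks = [
    (lambda a: a + 1, ["a"], "b"),          # reads a, writes b
    (lambda a: a * 2, ["a"], "c"),          # independent of the first task
    (lambda b, c: b + c, ["b", "c"], "d"),  # must wait for both b and c
]
results = run_tasks(tasks, {"a": 10})
print(results)
```

The first two tasks share no dependency and may run concurrently; the third starts only after both have produced their outputs.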


A runtime designed to serve as runtime support in parallel environments. It is mainly used to support OmpSs, an extension to OpenMP developed at BSC. It also has modules to support OpenMP and Chapel. Nanospp provides services to support task parallelism using synchronizations based on data dependencies. Data parallelism is also supported by means of services mapped on top of its task support. Tasks are implemented as user-level threads when possible. It also provides support for maintaining coherence across different address spaces (such as with GPUs or cluster nodes). It provides software directory and cache modules to this end.


Mercurium is a source-to-source compilation infrastructure aimed at fast prototyping. Currently supported languages are C and C++. Mercurium is mainly used in the Nanos environment to implement OpenMP, but since it is quite extensible it has been used to implement other programming models or compiler transformations; examples include Cell Superscalar, Software Transactional Memory, Distributed Shared Memory or the ACOTES project, just to name a few.

Extending Mercurium is achieved using a plugin architecture, where plugins represent several phases of the compiler. These plugins are written in C++ and dynamically loaded by the compiler according to the chosen configuration. Code transformations are implemented in terms of source code (there is no need to modify or know the internal syntactic representation of the compiler).


An open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs. OP2 is an API with associated libraries and preprocessors to generate parallel executables for applications on unstructured grids.

The OP2 project is developing an open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs. Although OP2 is designed to look like a conventional library, the implementation uses source-source translation to generate the appropriate back-end code for the different target platforms.


The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator. OpenACC is designed for portability across operating systems, host CPUs, and a wide range of accelerators, including APUs, GPUs, and many-core coprocessors. The directives and programming model defined in the OpenACC API document allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.

OpenACC in GCC

This page contains information on GCC’s implementation of the OpenACC specification and related functionality.

KernelGen (LLVM)

A prototype of auto-parallelizing Fortran/C compiler for NVIDIA GPUs, targeting numerical modelling code.


OpenAlea is an open source project primarily aimed at the plant research community. It is a distributed collaborative effort to develop Python libraries and tools that address the needs of current and future works in Plant Architecture modeling. OpenAlea includes modules to analyse, visualize and model the functioning and growth of plant architecture.


OpenARC is a new, open source compiler framework that provides an extensible environment in which various performance optimizations, traceability mechanisms, fault tolerance techniques, etc., can be built for better debuggability, performance, and resilience in complex accelerator computing.


Open CASCADE Technology is a software development kit (SDK) intended for development of applications dealing with 3D CAD data, freely available in open source. It includes a set of C++ class libraries providing services for 3D surface and solid modeling, visualization, data exchange and rapid application development.


OpenCL™ is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. OpenCL (Open Computing Language) greatly improves speed and responsiveness for a wide spectrum of applications in numerous market categories from gaming and entertainment to scientific and medical software.

Intel OpenCL

Intel® Code Builder for OpenCL™ API is a comprehensive environment for OpenCL software development on Intel Architecture processors and Intel Xeon Phi™ coprocessors. The Code Builder comprises the Intel implementation of the OpenCL standard and a set of tools for OpenCL application development on Linux* operating systems.


Portable Computing Language (pocl) aims to become an MIT-licensed open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPUs and heterogeneous GPUs/accelerators.

pocl uses Clang as an OpenCL C frontend and LLVM for the kernel compiler implementation, and as a portability layer. Thus, if your desired target has an LLVM backend, it should be able to get OpenCL support easily by using pocl.

The goal is to accomplish improved performance portability using a kernel compiler that can generate multi-work-item work-group functions that exploit various types of parallel hardware resources: VLIW, superscalar, SIMD, SIMT, multicore, multithread …​

An additional purpose of the project is to serve as a research platform for issues in parallel programming on heterogeneous platforms.


OpenClimateGIS (OCGIS) is a Python package designed for geospatial manipulation, subsetting, computation, and translation of climate datasets stored in local NetCDF files or files served through THREDDS data servers. OpenClimateGIS has a straightforward, request-based API that is simple to use yet complex enough to perform a variety of computational tasks. The software is built entirely from open source packages. ClimateTranslator is a new web interface to the OpenClimateGIS functionality.


OpenCMISS libraries and applications provide the foundation for developing computational modelling and visualisation software, particularly targeting bioengineering.


OpenCores is an open source hardware community developing digital open source hardware through electronic design automation, with a similar ethos to the free software movement. OpenCores hopes to eliminate redundant design work and slash development costs.


OPeNDAP stands for "Open-source Project for a Network Data Access Protocol". OPeNDAP is both the name of a non-profit organization and the commonly used name of a protocol which the OPeNDAP organization has developed. The DAP2 protocol provides a discipline-neutral means of requesting and providing data across the World Wide Web. The goal is to allow end users, whoever they may be, to access immediately whatever data they require in a form they can use, all while using applications they already possess and are familiar with. In the field of oceanography, OPeNDAP has already helped the research community make significant progress towards this end. Ultimately, it is hoped, OPeNDAP will be a fundamental component of systems which provide machine-to-machine interoperability with semantic meaning in a highly distributed environment of heterogeneous datasets. The OPeNDAP organization exists to develop, implement, and promulgate the OPeNDAP protocol. It presents the results of its work freely to the public with the hope that it will be of service in many disciplines and facilitate sharing of and access to their data streams.
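In practice a DAP2 request is an ordinary URL: a response suffix selects what is returned, and an optional constraint expression selects a variable and an index range. The sketch below assembles such URLs; the server address and the variable name `sst` are hypothetical, and the `dap2_url` helper is invented for this illustration rather than taken from any OPeNDAP software.

```python
def dap2_url(dataset_url, response, variable=None, slices=()):
    """Build a DAP2 request URL (illustrative helper).

    response: one of 'dds' (structure), 'das' (attributes),
              'dods' (binary data) or 'ascii' (text data)
    slices:   one (start, stride, stop) tuple per array dimension
    """
    url = "%s.%s" % (dataset_url, response)
    if variable is not None:
        # Constraint expression: variable name followed by one hyperslab per dimension.
        hyperslab = "".join("[%d:%d:%d]" % s for s in slices)
        url += "?" + variable + hyperslab
    return url


base = "http://example.org/opendap/sst.nc"  # hypothetical dataset

print(dap2_url(base, "dds"))
print(dap2_url(base, "ascii", "sst", [(0, 1, 0), (10, 2, 20)]))
```

Because only the requested subset travels over the network, a client can work with a small slice of a very large remote dataset.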


Pydap is a pure Python library implementing the Data Access Protocol, also known as DODS or OPeNDAP. You can use Pydap as a client to access hundreds of scientific datasets in a transparent and efficient way through the internet; or as a server to easily distribute your data from a variety of formats.

Pydap includes several handlers, i.e. special Python modules that convert between a given data format and the data model used by Pydap (defined in the pydap.model module). They are necessary for Pydap to be able to actually serve a dataset. There are handlers for NetCDF, HDF 4 & 5, Matlab, relational databases, Grib 1 & 2, CSV, Seabird CTD files, and a few more.


The goal of the OpenDSA project is to create open-source courseware for use in Data Structures and Algorithms courses, that deeply integrates textbook-quality content with algorithm visualizations and interactive, automatically assessed exercises.


OpenFL is a free and open source software framework and platform for the creation of multi-platform applications and video games. OpenFL programs are written in a single language (Haxe) and may be published to Flash movies, or standalone applications for Microsoft Windows, Mac OS X, Linux, iOS, Android, BlackBerry OS, Firefox OS, HTML5 and Tizen.

OpenFL is designed to mimic Adobe Flash Player, and provides much of the same functionality and API. SWF files created with Adobe Flash Professional or other authoring tools may be used in OpenFL programs.


OpenNL (Open Numerical Library) is a library for solving sparse linear systems, especially designed for the Computer Graphics community. The goal for OpenNL is to be as small as possible, while offering the subset of functionalities required by this application field. The Makefiles of OpenNL can generate a single .c + .h file, very easy to integrate in other projects. The distribution includes an implementation of our Least Squares Conformal Maps parameterization method. It includes support for CUDA and Fermi architecture (Concurrent Number Cruncher and Nathan Bell’s ELL formats).


OpenSim is a freely available, user extensible software system that lets users develop models of musculoskeletal structures and create dynamic simulations of movement.

OpenSim 3.2 includes an improved scripting interface, accessible through the Graphical User Interface (GUI), Matlab, and now Python. We also added new visualization capabilities and usability improvements in the OpenSim application.


OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface.

OpenStack is a free and open-source cloud computing software platform. Users primarily deploy it as an infrastructure as a service (IaaS) solution. The technology consists of a series of interrelated projects that control pools of processing, storage, and networking resources throughout a data center, which users manage through a web-based dashboard, command-line tools, or a RESTful API. It is released under the terms of the Apache License.


Neutron is an OpenStack project to provide "networking as a service" between interface devices (e.g., vNICs) managed by other OpenStack services (e.g., nova).


OpenUH is an open source, optimizing compiler suite for C, C++ and Fortran, based on Open64. It supports a variety of architectures including x86-64, IA-32, IA-64, MIPS, and PTX.

OpenUH extends the Open64 OpenMP implementation by adding support for nested parallelism and the tasking features introduced in OpenMP 3.0. The OpenMP runtime library that comes with OpenUH supports several task scheduling strategies, enables selection of more scalable barrier algorithms, and provides an implementation of the OpenMP Collector API for interaction with performance collection tools (including DARWIN). The OpenMP implementation has been successfully tested using a number of applications and validated with the NAS Parallel Benchmarks (NPB) and our OpenMP Validation Suite, developed in collaboration with the High Performance Computing Center Stuttgart (HLRS) from the University of Stuttgart. OpenUH also provides support for Fortran coarrays, an extension that has been adopted in the Fortran 2008 standard. With the use of coarrays, a programmer can easily write parallel Fortran programs for a variety of parallel systems. The OpenUH CAF implementation can work in conjunction with either the GASNet or ARMCI runtime libraries, open-source projects which are freely downloadable online.

To achieve portability, OpenUH is able to emit optimized C or Fortran 77 code that may be compiled by a native compiler on other platforms. The supporting runtime libraries are also portable - the OpenMP runtime library is based on the portable Pthreads interface while the Coarray Fortran runtime library is based on the portable GASNet (or, optionally, ARMCI) communications interfaces.


The openZIM project proposes offline storage solutions for content coming from the Web. The project has two different targets:

  • Definition of the ZIM file format: an open and standardized file format,

  • Implementation of the zimlib: an open source (GPLv2) implementation of the ZIM file format.

See also Kiwix and Internet-in-a-Box.


Orange File System is a branch of the Parallel Virtual File System. Like PVFS, Orange is a parallel file system designed for use on high end computing (HEC) systems that provides very high performance access to disk storage for parallel applications. OrangeFS is different from PVFS in that we have developed features for OrangeFS that are not presently available in the PVFS main distribution. While PVFS development tends to focus on specific very large systems, Orange considers a number of areas that have not been well supported by PVFS in the past.

OrangeFS is presently integrated with ROMIO through MPICH2, and includes FUSE support. It has also been integrated with pNFS.


Orcc is an open-source Integrated Development Environment based on Eclipse and dedicated to dataflow programming. The primary purpose of Orcc is to provide developers with a compiler infrastructure to allow software/hardware code to be generated from dataflow descriptions. Orcc does not generate assembly or executable code directly, rather it generates source code that must be compiled by another tool.

Orcc also provides a complete Java-based simulator which allows developers to quickly test their applications without taking into consideration low-level details of the target platform. The simulator can be launched directly from Eclipse to execute any RVC-CAL application. The simulator simply interprets our intermediate representation of networks and actors, but it is nevertheless able to perform all basic interactions required for functional validation, such as displaying text, images or videos on the screen.


Orc-apps is a library of open-source applications described in a dynamic dataflow programming way, using the RVC-CAL and FNL languages. The applications are fully compliant with the Orcc toolset.


ORCM was originally developed as an open-source project (under the Open MPI license) by Cisco Systems, Inc to provide a resilient, 100% uptime run-time environment for enterprise-class routers. Based on the Open Run-Time Environment (ORTE) embedded in Open MPI, the system provided launch and execution support for processes executing within the router itself (e.g., computing routing tables), ensuring that a minimum number of copies of each program were always present. Failed processes were relocated based on the concept of fault groups - i.e., the grouping of nodes with common failure modes. Thus, ORCM attempted to avoid cascade failures by ensuring that processes were not relocated onto nodes with a high probability of failing in the immediate future.

The Cisco implementation naturally required a significant amount of monitoring, and included the notion of fault prediction as a means of taking pre-emptive action to relocate processes prior to their node failing. This was facilitated using an analytics framework that allowed users to chain various analysis modules in the data pipeline so as to perform in-flight data reduction.

Subsequently, ORCM was extended by Greenplum to serve as a scalable monitoring system for Hadoop clusters. While ORCM itself had run on quite a few "nodes" in the Cisco router, and its base ORTE platform has been used for years on very large clusters involving many thousands of nodes, this was the first time the ORCM/ORTE platform had been used solely as a system state-of-health monitor with no responsibility for process launch or monitoring. Instead, ORCM was asked to provide a resilient, scalable monitoring capability that tracked process resource utilization and node state-of-health, collecting all the data in a database for subsequent analysis. Sampling rates were low enough that in-flight data reduction was not required, nor was fault prediction considered to be of value in the Hadoop paradigm.


Ori is a distributed file system built for offline operation that gives the user control over synchronization operations and conflict resolution. We provide history through lightweight snapshots and allow users to verify that the history has not been tampered with. Through the use of replication, instances can be resilient and recover damaged data from other nodes.
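One common way to make a snapshot history verifiable is to chain hashes, so that every snapshot identifier depends on its content and on its parent. The sketch below shows that general technique with Python's `hashlib`; the source does not describe Ori's actual storage format, so the functions here are assumptions made purely for illustration.

```python
import hashlib


def snapshot_id(parent_id, content):
    """Derive a snapshot id from the parent's id and the snapshot content (bytes)."""
    h = hashlib.sha256()
    h.update(parent_id.encode("ascii"))
    h.update(hashlib.sha256(content).digest())
    return h.hexdigest()


def build_history(contents):
    """Return a list of (snapshot_id, content) pairs, each chained to the previous one."""
    history, parent = [], ""
    for content in contents:
        sid = snapshot_id(parent, content)
        history.append((sid, content))
        parent = sid
    return history


def verify_history(history):
    """Recompute every id; any altered content or reordering breaks the chain."""
    parent = ""
    for sid, content in history:
        if snapshot_id(parent, content) != sid:
            return False
        parent = sid
    return True


history = build_history([b"v1 of notes.txt", b"v2 of notes.txt", b"v3 of notes.txt"])
print(verify_history(history))

# Simulate tampering with the middle snapshot while keeping its recorded id.
tampered = list(history)
tampered[1] = (tampered[1][0], b"forged content")
print(verify_history(tampered))
```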


An open-source extensible framework for the definition of domain-specific languages and generation of optimized (C, Fortran, CUDA, OpenCL) code for multiple architecture targets (e.g., CPUs, NVIDIA and AMD GPUs, Intel Phi), including support for empirical autotuning of the generated code.

Orio is a Python framework for transformation and automatically tuning the performance of codes written in different source and target languages, including transformations from a number of simple languages (e.g., a restricted subset of C) to C, Fortran, CUDA, and OpenCL targets. The tool generates many tuned versions of the same operation using different optimization parameters, and performs an empirical search for selecting the best among multiple optimized code variants.


ORSA is an interactive tool for scientific grade Celestial Mechanics computations. Asteroids, comets, artificial satellites, Solar and extra-Solar planetary systems can be accurately reproduced, simulated, and analyzed. One of the main goals is to create a common infrastructure among the existing celestial mechanics programs and standards. The features include:

  • accurate numerical algorithms

  • use of JPL ephemeris files for accurate planet positions

  • Qt-based graphical user interface

  • advanced 2D plotting tool and 3D OpenGL viewer

  • import asteroids and comets from all the known databases (MPC, JPL, Lowell, AstDyS, and NEODyS)

  • integrated download tool to update databases

  • stand alone numerical library liborsa


RendezvousWithVesta is a graphical software tool developed by Pasquale Tricarico at the Planetary Science Institute, in support of the NASA DAWN mission. It accurately simulates the dynamics of a spacecraft orbiting the asteroid (4) Vesta. The motivations for developing this tool are to (1) understand how the physical parameters of Vesta affect the stability of low polar orbits; (2) understand how the physical parameters of Vesta and the orbital elements of DAWN affect the coverage of Vesta’s surface; and (3) provide a fast and reliable tool for the generation of orbits suitable for input in the Science Opportunity Analyzer (SOA) tool.

The features include:

  • validated numerical algorithms, tested on NEAR mission data, and capable of accurately reproducing NEAR’s orbit around Eros;

  • complete control over Vesta’s physical properties: mass, mass distribution model, shape model, rotation period, and pole ecliptic latitude and longitude;

  • control over DAWN’s initial orbit around Vesta: epoch, radius, equatorial (Vesta’s equator) inclination, phase angle;

  • export simulations as SPICE kernel files and as ASCII data files;

  • 3D graphical visualization of the numerical simulation, including the ground tracking of DAWN over Vesta’s surface;

  • 2D plot of the altitude of the spacecraft and of the Vesta profile at nadir; and

  • completely open source and part of the ORSA framework.


SurfaceCoverage is a graphical software tool developed by Pasquale Tricarico at the Planetary Science Institute, in support of the NASA Dawn mission. It estimates the coverage of the surface of asteroid (4) Vesta by the Dawn spacecraft in a wide range of configurations.


OSTree is a tool for managing bootable, immutable, versioned filesystem trees. It is not a package system; nor is it a tool for managing full disk images. Instead, it sits between those levels, offering a blend of the advantages (and disadvantages) of both.

You can use any build system you like to place content into it on a build server, then export an OSTree repository via static HTTP. On each client system, "ostree admin upgrade" can incrementally replicate that content, creating a new root for the next reboot. This provides fully atomic upgrades. Any changes made to /etc are propagated forwards, and all local state in /var is shared.

A key goal of the project is to complement existing package systems like RPM and Debian packages, and help further their evolution. In particular for example, RPM-OSTree (linked below) has as a goal a hybrid tree/package model, where you replicate a base tree via OSTree, and then add packages on top.


An open source document mining platform. Read and analyze thousands of documents super quickly. Full text search, topic modeling, coding and tagging, visualizations and more. All in an easy-to-use, visual workflow.


ownCloud provides access to your data through a web interface or WebDAV while providing a platform to view, sync and share across devices easily—all under your control. ownCloud’s open architecture is extensible via a simple but powerful API for applications and plugins and works with any storage.


A Python package for arithmetical computations on random variables. The package is capable of performing the four arithmetic operations: addition, subtraction, multiplication and division, as well as computing many standard functions of random variables. Summary statistics, random number generation, plots, and histograms of the resulting distributions can easily be obtained and distribution parameter fitting is also available. The operations are performed numerically and their results interpolated allowing for arbitrary arithmetic operations on random variables following practically any probability distribution encountered in practice. The package is easy to use, as operations on random variables are performed just as they are on standard Python variables. Independence of random variables is, by default, assumed on each step but some computations on dependent random variables are also possible. We demonstrate on several examples that the results are very accurate, often close to machine precision. Practical applications include statistics, physical measurements or estimation of error distributions in scientific computations.
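To illustrate what arithmetic on random variables means numerically, the sketch below computes the density of the sum of two independent variables by discrete convolution on a regular grid. It is a deliberately crude Riemann-sum approximation using only built-in Python, not the package's API or its interpolation-based method; the function name `convolve_pdfs` is invented for this example.

```python
def convolve_pdfs(f, g, dx):
    """Approximate the density of X + Y for independent X and Y.

    f, g: density samples on grids with the same spacing dx
    Returns density samples of the sum on the combined grid.
    """
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj * dx
    return out


dx = 0.01
uniform = [1.0] * 100          # U(0, 1) density sampled at cell midpoints
total = convolve_pdfs(uniform, uniform, dx)

mass = sum(total) * dx         # should stay close to 1
peak_index = total.index(max(total))
peak_x = (peak_index + 1) * dx  # midpoints add: (i + 0.5)dx + (j + 0.5)dx

print(round(mass, 6), round(peak_x, 6), round(max(total), 6))
```

The sum of two uniform variables on (0, 1) has a triangular density on (0, 2) peaking at 1, which is what the grid approximation reproduces.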


PadicoTM is the runtime infrastructure for the Padico software environment for computational grids. It is composed of a core which provides a high-performance framework for networking and multi-threading, and services plugged into the core. High-performance communications and threads are obtained thanks to Marcel and Madeleine, provided by the PM2 software suite. The PadicoTM core aims at making the different services running at the same time run in a cooperative way rather than competitive.

PadicoTM exposes standard interfaces (VIO: virtual sockets; Circuit: Madeleine-like API; etc.) usable by various middleware systems. Thanks to symbol interception by PadicoTM, middleware runs unmodified and uses PadicoTM communication methods seamlessly. The middleware systems available over PadicoTM are:

  • CORBA implementations: omniORB and Mico

  • the MPI implementations NewMadeleine and GridMPI

  • a Java Virtual Machine based on Kaffe

  • the gSOAP SOAP/Web services development toolkit

  • an implementation of the JXTA P2P specifications called JXTA-C


An open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other 3rd party libraries.


GeoCoon is a GIS data analysis Python library which integrates Pandas data frames with Shapely GIS geometries. The library provides means to load GIS data in the form of Shapely objects into a Pandas data frame and analyze the data using Pandas idioms. It allows attributes to be accessed and methods to be called on Shapely geometries in a vectorized manner.

GeoCoon supports:

  • Point, line string and polygon geometries.

  • Vectorized GIS object attribute access and method execution

  • Pandas data selection and split-apply-combine idioms.

  • SQL/MM databases, i.e. PostgreSQL with PostGIS extension

  • Multiple geometry columns in a data frame


GeoPandas is an open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. GeoPandas further depends on fiona for file access and descartes and matplotlib for plotting.


PaPy, which stands for parallel pipelines in Python, is a highly flexible framework that enables the construction of robust, scalable workflows for either generating or processing voluminous datasets. A workflow is created from user-written Python functions (nodes) connected by pipes (edges) into a directed acyclic graph. These functions are arbitrarily definable, and can make use of any Python modules or external binaries. Given a user-defined topology and collection of input data, functions are composed into nested higher-order maps, which are transparently and robustly evaluated in parallel on a single computer or on remote hosts. Local and remote computational resources can be flexibly pooled and assigned to functional nodes, thereby allowing facile load-balancing and pipeline optimization to maximize computational throughput. Input items are processed by nodes in parallel, and traverse the graph in batches of adjustable size — a trade-off between lazy-evaluation, parallelism, and memory consumption. The processing of a single item can be parallelized in a scatter/gather scheme. The simplicity and flexibility of distributed workflows using PaPy bridges the gap between desktop → grid, enabling this new computing paradigm to be leveraged in the processing of large scientific datasets.
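The workflow model described above can be imitated with the standard library alone. In this sketch, functions are nodes, `(upstream, downstream)` pairs are pipes, and input items are pushed through the graph in parallel, one batch at a time. It is not PaPy's API: `run_pipeline`, its arguments, and the single-sink assumption are choices made for this illustration, and `graphlib` requires Python 3.9 or later.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_pipeline(nodes, pipes, items, batch_size=2, workers=4):
    """Evaluate a DAG of functions on each item, processing items in batches.

    nodes: dict mapping node name -> function
    pipes: list of (upstream, downstream) edges
    Assumes a single final node whose value is the result for each item.
    """
    upstream = {name: [] for name in nodes}
    for src, dst in pipes:
        upstream[dst].append(src)
    order = list(TopologicalSorter(upstream).static_order())

    def process(item):
        values = {}
        for name in order:
            # Nodes without upstream pipes receive the raw input item.
            args = [values[u] for u in upstream[name]] or [item]
            values[name] = nodes[name](*args)
        return values[order[-1]]

    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(items), batch_size):
            batch = items[start:start + batch_size]
            results.extend(pool.map(process, batch))  # one batch in flight at a time
    return results


nodes = {
    "parse": int,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
    "combine": lambda a, b: a + b,
}
pipes = [("parse", "square"), ("parse", "negate"),
         ("square", "combine"), ("negate", "combine")]

print(run_pipeline(nodes, pipes, ["1", "2", "3"]))
```

A smaller `batch_size` keeps fewer items in memory at once, while a larger one exposes more parallelism, which mirrors the trade-off mentioned in the description.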


A Python CDM for met/ocean data.

Parallel For

A data parallel scientific programming model.

Compiles efficiently to different parallel architectures like distributed memory with message passing [MPI, PVM] and one-sided communication [MPI-2, shmem], shared memory multi-processor and multi-core processors [POSIX threads, OpenMP, boost threads, Intel TBB], procedure off-loading [Nvidia Cuda, Cell BE, AMD Brook, OpenCL], SIMD vectorization [SSE and AltiVec], and sequential C++ code.

Transforms inherently parallel expressions into efficient parallel C++ code.


PARALUTION is a library that enables you to perform various sparse iterative solvers and preconditioners on multi/many-core CPU and GPU devices. Based on C++, it provides a generic and flexible design that allows seamless integration with other scientific software packages.

PARALUTION contains Krylov subspace solvers (CR, CG, BiCGStab, GMRES, IDR), Multigrid (GMG, AMG), Deflated PCG, Fixed-point iteration schemes, Mixed-precision schemes and fine-grained parallel preconditioners based on splitting, ILU factorization with levels, multi-elimination ILU factorization, additive Schwarz and approximate inverse. The library also provides iterative eigenvalue solvers.

The library can be compiled under Linux/Unix-like systems, Windows and Mac OS. PARALUTION provides multi-core CPU/Host (OpenMP), NVIDIA GPU (CUDA, OpenCL), AMD GPU (OpenCL), and Intel Xeon Phi/MIC (OpenCL, OpenMP/offload mode) support, and works with the VS (Visual Studio), gcc (GNU C) and icc (Intel C) compilers.


ParaViewWeb is a collection of components that enables the use of ParaView’s visualization and data analysis capabilities within Web applications. Using the latest HTML 5.0 based technologies, such as WebSocket and WebGL, ParaViewWeb enables communication with a ParaView server running on a remote visualization node or cluster using a light-weight JavaScript API. Using this API, Web applications can easily embed interactive 3D visualization components. Application developers can write simple Python scripts to extend the server capabilities, including creating custom visualization pipelines.


The PA.rticle SY.stems V.isualization and A.nalysis T.ool is a program able to perform unique visual analysis on generic particle systems: PASYVAT (PArticle SYstem Visual Analysis Tool). More specifically, it can perform a selection of multiple interparticle distance ranges from a radial distribution function (RDF) plot and display them in 3D as bonds. This software can be used with any data set representing a system of particles in 3D. In this manuscript the reader will find a description of the program and its internal structure, with emphasis on its applicability in the study of certain particle configurations, obtained from classical molecular dynamics simulation in condensed matter physics.


The "Programming with Big Data in R" project (pbdR) enables high-level distributed data parallelism in R, so that it can easily utilize large HPC platforms with thousands of cores, making the R language scale to unparalleled heights. We interpret big data quite literally to mean that its size requires parallel processing either because it does not fit in the memory of a single multicore machine or because we need to make its processing time tolerable.

We achieve this, in part, by providing a simple interface to scalable, high performance libraries, such as MPI, ScaLAPACK, and NetCDF4. The routines in these libraries are engaged through R’s classes and methods, so that the R language syntax is largely preserved, but with new, scalable, compiled code underneath. Most of the cumbersome distributed details are abstracted away for the user, although they are readily accessible should the user desire them.


PaStiX (Parallel Sparse matriX package) is a scientific library that provides a high performance parallel solver for very large sparse linear systems based on direct methods. Numerical algorithms are implemented in single or double precision (real or complex) using LLt, LDLt and LU with static pivoting (for non symmetric matrices having a symmetric pattern). The solver also provides an adaptive blockwise iLU(k) factorization that can be used as a parallel preconditioner using approximated supernodes to build a coarser block structure of the incomplete factors.
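As a small illustration of what a direct method based on the LLt (Cholesky) factorization does, the following dense, sequential sketch factors a symmetric positive definite matrix and solves one system. It is not PaStiX code and makes no use of sparsity or parallelism; the function names cholesky and solve_llt are hypothetical.

```python
import math


def cholesky(a):
    """Return the lower-triangular L with A = L * L^T (A symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_llt(l, b):
    """Solve L * L^T * x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


A = [[4.0, 2.0], [2.0, 3.0]]
L = cholesky(A)
x = solve_llt(L, [6.0, 5.0])
print(x)
```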


PCA that iteratively replaces missing data. An implementation of probabilistic principal components analysis, a variant of vanilla PCA that can be used to compute factors where some of the data are missing and to interpolate data by using information from additional series.


PeerVPN is software that builds virtual Ethernet networks between multiple computers. Such a virtual network can be useful to facilitate direct communication that applications like file sharing or gaming may need. Often, such direct communication is made impossible or very difficult by firewalls or NAT devices.

Most traditional VPN solutions follow the client-server principle, which means that all participating nodes connect to a central server. This creates a star topology, which has some disadvantages. The central node needs lots of bandwidth, because it needs to handle all the VPN traffic. Also, if the central node goes down, the whole VPN is down too.

A virtual network built by PeerVPN uses a full mesh topology. All nodes talk directly to each other, so there is no need for a central server. If one node goes down, the rest of the network is unaffected.
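The difference between the two topologies can be made concrete by counting links. The sketch below is purely illustrative and has nothing to do with PeerVPN's implementation; the node names and the helper functions are hypothetical.

```python
from itertools import combinations


def star_edges(nodes, hub):
    """Every node has exactly one link, to the central hub."""
    return {frozenset((hub, n)) for n in nodes if n != hub}


def mesh_edges(nodes):
    """Every pair of nodes has a direct link."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}


def surviving_edges(edges, failed):
    """Links that remain usable after one node goes down."""
    return {e for e in edges if failed not in e}


nodes = ["a", "b", "c", "d", "e"]
star = star_edges(nodes, hub="a")
mesh = mesh_edges(nodes)

print(len(star), len(mesh))  # n - 1 links versus n * (n - 1) / 2 links
print(len(surviving_edges(star, "a")), len(surviving_edges(mesh, "a")))
```

With five nodes the star has 4 links and the mesh has 10. Losing node "a" leaves the star with no links at all, while the mesh keeps all 6 links among the remaining four nodes.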


PEGASUS is a Peta-scale graph mining system, fully written in Java. It runs in a parallel, distributed manner on top of Hadoop. Hadoop is a cloud computing platform, as well as an open source implementation of the MapReduce framework, which was originally designed for web-scale data processing by Google.

Existing work on graph mining has limited scalability: usually, the maximum graph size is on the order of millions. PEGASUS breaks the limit by scaling up the algorithms to billion-scale graphs. The breakthrough was made possible by careful algorithm design and implementation for Hadoop, a massive cloud computing platform.


This software framework implements a NURBS-based Galerkin finite element method (FEM), popularly known as isogeometric analysis (IGA). It is heavily based on PETSc, the Portable, Extensible Toolkit for Scientific Computation. PETSc is a collection of algorithms and data structures for the solution of scientific problems, particularly those modeled by partial differential equations (PDEs). PETSc is written to be applicable to a range of problem sizes, including large-scale simulations where high performance parallel computing is a must. PetIGA can be thought of as an extension of PETSc, which adds the NURBS discretization capability and the integration of forms. The PetIGA framework is intended for researchers in the numerical solution of PDEs who have applications which require extensive computational resources.


A toolkit for development of scientific applications related to processing observational data. It includes:

  • Fast linear algebra routines, including one of the fastest subroutines for inversion of a square symmetric positive definite matrix in the upper triangular representation.

  • Graphic library DiaGI (Dialog Graphic Interface) which makes a plot of one-dimensional function(s) from one call and allows a user to adjust parameters of the plot interactively.

  • Routine MatView which displays a portion of a big matrix on the screen and allows a user to change the boundaries of the displayed area interactively.

  • A set of routines for manipulating splines; various routines for multi-dimensional B-spline transform, etc.

  • Various routines for least squares, regression computation, error handling, an interface to low-level I/O, date transformation, etc.

See fourpack.


Petuum is a distributed machine learning framework. It aims to provide a generic algorithmic and systems interface to large scale machine learning, and takes care of difficult systems "plumbing work" and algorithmic acceleration, while simplifying the distributed implementation of ML programs - allowing you to focus on model perfection and Big Data Analytics. Petuum runs efficiently at scale on research clusters and cloud compute like Amazon EC2 and Google GCE.


The primary goal of the PHAML (Parallel Hierarchical Adaptive MultiLevel method) project is to develop new methods and software for the efficient solution of 2D elliptic partial differential equations (PDEs) on distributed memory parallel computers and multicore computers using adaptive mesh refinement and multigrid solution techniques.


Pharo is a pure object-oriented programming language and a powerful environment, focused on simplicity and immediate feedback (think IDE and OS rolled into one).


PHCpack is a software package to solve polynomial systems by homotopy continuation methods.

A polynomial system is given as a sequence of polynomials in several variables. Homotopy continuation methods operate in two stages. In the first stage, a family of polynomial systems (the so-called homotopy) is constructed. This homotopy contains a polynomial system with known solutions. In the second stage, numerical continuation methods are applied to track the solution paths defined by the homotopy, starting at the known solutions and leading to the solutions of the given polynomial system.
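The two stages can be sketched for a single polynomial in one variable. The example below is not PHCpack code: the target polynomial f, the start polynomial g, the complex constant GAMMA and the step counts are all choices made for illustration, and the path tracker uses only Newton correction at each step, without the predictor that production trackers add.

```python
def f(x):
    """Target polynomial, with roots 2 and 3."""
    return x * x - 5 * x + 6


def df(x):
    return 2 * x - 5


def g(x):
    """Start polynomial, with known roots +1 and -1."""
    return x * x - 1


def dg(x):
    return 2 * x


# A generic complex constant keeps the solution paths away from singularities.
GAMMA = complex(0.6, 0.8)


def track(x, steps=400, newton_iters=5):
    """Follow one root of H(x, t) = (1 - t) * GAMMA * g(x) + t * f(x) from t = 0 to t = 1."""
    for k in range(1, steps + 1):
        t = k / steps
        for _ in range(newton_iters):  # Newton corrector at the new value of t
            h = (1 - t) * GAMMA * g(x) + t * f(x)
            dh = (1 - t) * GAMMA * dg(x) + t * df(x)
            x = x - h / dh
    return x


# Stage 1 supplies the known start solutions; stage 2 tracks each path.
roots = sorted((track(complex(s)) for s in (1.0, -1.0)), key=lambda z: z.real)
print(roots)
```

Each start root of g is carried along its own path to one root of f, so the two tracked paths end at 2 and 3.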


This documentation describes a collection of Python modules to compute solutions of polynomial systems using PHCpack.


A toolbox for developing parallel adaptive finite element programs. PHG deals with conforming tetrahedral meshes and uses bisection for adaptive local mesh refinement and MPI for message passing. PHG has an object oriented design which hides parallelization details and provides common operations on meshes and finite element functions in an abstract way, allowing the users to concentrate on their numerical algorithms.

PHG has a set of rich and easy to use interfaces to other packages, including ParMETIS, PETSc, Hypre, SuperLU, MUMPS, Trilinos, PARPACK, JDBSYM, LOBPCG.


Picat is a simple, and yet powerful, logic-based multi-paradigm programming Qt-like (and Qt compatible!) signal and slot mechanism you can easily register notifications.

As ZMQ is used as the transport layer, communication is fast and efficient, and different protocols are supported. It has complete test coverage. It runs in Python 3.2+ and requires PyZMQ. It is licensed under BSD.

Platform MPI

IBM® Platform MPI Community Edition is a no-charge community edition of IBM Platform MPI supporting the core MPI features. It is available for download, deployment, and redistribution at no charge. This edition is simple, flexible, powerful, and reliable; easy to install, embed, deploy; embodies core capabilities of Platform MPI for Linux® and Windows®; and provides an optional low cost offering that includes higher rank counts, 24/7 IBM customer support, fix packs, and upgrade protection.


A simple, accessible HTML5 media player.


A low-level generic runtime system which integrates multithreading management and a high performance multi-cluster communication library. PM2 is an umbrella software suite for high-performance runtime systems. Modules may be installed and used together or separately. The modules are:

  • NewMadeleine - a high performance communication library for clusters

  • PIOMan - a generic I/O manager designed to deal with interactions between communication and multithreading

  • PadicoTM - a component-based high performance communication framework for grid computing that enables a wide variety of middleware systems, e.g. MPI, CORBA, Java RMI, ICE, SOAP, etc.

  • Marcel - a thread library developed to meet the needs of the PM2 multithreaded environment


The Process Management Interface (PMI) has been used for quite some time as a means of exchanging wireup information needed for interprocess communication. Two versions (PMI-1 and PMI-2) have been released as part of the MPICH effort. While PMI-2 demonstrates better scaling properties than its PMI-1 predecessor, attaining rapid launch and wireup of the roughly 1M processes executing across 100k nodes expected for exascale operations remains challenging.

PMI Exascale (PMIx) represents an attempt to resolve these questions by providing an extended version of the PMI standard specifically designed to support clusters up to and including exascale sizes. The overall objective of the project is not to branch the existing pseudo-standard definitions - in fact, PMIx fully supports both of the existing PMI-1 and PMI-2 APIs - but rather to (a) augment and extend those APIs to eliminate some current restrictions that impact scalability, and (b) provide a reference implementation of the PMI-server that demonstrates the desired level of scalability.


Pochoir (pronounced "PO-shwar") is a compiler and runtime system for implementing stencil computations on multicore processors. A stencil defines the value of a grid point in a d-dimensional spatial grid at time t as a function of neighboring grid points at recent times before t. A stencil computation computes the stencil for each grid point over many time steps. Using Pochoir, a user specifies a computing kernel and boundary conditions using a simple stencil language embedded in C. The Pochoir compiler produces cache-efficient multithreaded C code that can be compiled with the Intel 12.0 compiler for C with the Cilk multithreading extensions, which is available as part of the Intel Parallel Computer suite. The Pochoir package contains two main components: a C template library for debugging and testing Pochoir compliance and a domain-specific compiler written in Haskell that produces highly optimized code.
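To make the definition of a stencil concrete, here is a plain sequential sketch of a three-point stencil for the one-dimensional heat equation. It is not written in the Pochoir language and does none of Pochoir's cache-efficient scheduling; the names heat_step and run, the coefficient alpha, and the fixed boundary values are assumptions made for the example.

```python
def heat_step(u, alpha):
    """Apply the stencil once: each interior point at time t + 1 depends on
    itself and its two neighbours at time t. The two end points are held
    fixed as the boundary condition."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def run(u, alpha, steps):
    """A stencil computation: apply the stencil to every grid point for many time steps."""
    for _ in range(steps):
        u = heat_step(u, alpha)
    return u


u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
u1 = run(u0, 0.25, 1)
print(u1)
```

After one step the initial spike has spread to its neighbours, giving [0.0, 0.25, 0.5, 0.25, 0.0].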


The Poincaré code is a Maple project package that aims to gather significant computer algebra normal form (and subsequent reduction) methods for handling nonlinear ordinary differential equations. As a first version, a set of fourteen easy-to-use Maple commands is introduced for symbolic creation of (improved variants of Poincaré’s) normal forms as well as their associated normalizing transformations. The software is the implementation by the authors of carefully studied and followed up selected normal form procedures from the literature, including some authors’ contributions to the subject. As can be seen, joint-normal-form programs involving Lie-point symmetries are of special interest and are published in CPC Program Library for the first time, Hamiltonian variants being also very useful as they lead to encouraging results when applied, for example, to models from computational physics like Hénon–Heiles.


Polly is a high-level loop and data-locality optimizer and optimization infrastructure for LLVM. It uses an abstract mathematical representation based on integer polyhedra to analyze and optimize the memory access pattern of a program. We currently perform classical loop transformations, especially tiling and loop fusion, to improve data-locality. Polly can also exploit OpenMP level parallelism and expose SIMDization opportunities. Work has also been done in the area of automatic GPU code generation.
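Tiling, one of the classical transformations mentioned above, can be shown by hand on a matrix transpose. The sketch below only demonstrates that the tiled loop nest visits the same iterations in a different order and produces the same result; it says nothing about how Polly derives such a schedule, and in Python it does not show the cache benefit. The function names and the tile size are hypothetical.

```python
def transpose_naive(a):
    """Original loop nest: row-by-row traversal."""
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for i in range(n):
        for j in range(m):
            out[j][i] = a[i][j]
    return out


def transpose_tiled(a, tile=2):
    """Tiled loop nest: the iteration space is cut into tile x tile blocks,
    and each block is finished before the next one starts."""
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for ii in range(0, n, tile):
        for jj in range(0, m, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, m)):
                    out[j][i] = a[i][j]
    return out


matrix = [[r * 5 + c for c in range(5)] for r in range(3)]
print(transpose_tiled(matrix) == transpose_naive(matrix))
```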


A tool to study the combinatorics and the geometry of convex polytopes and polyhedra. It is also capable of dealing with simplicial complexes, matroids, polyhedral fans, graphs, tropical objects, and other objects.


Pomegranate is an open source Python application that implements the open Webification (w10n) Science API for major scientific data stores (HDF, NetCDF, etc.). It makes inner components of a file, such as attributes and data arrays, directly addressable and accessible via well-defined and meaningful URLs.

Data exposed by w10n-sci API is readily consumable by any HTTP client. It can be as simple as a command line like curl or wget, or as advanced as a full-fledged HTML5 web application such as REX.

Pomegranate has been included in Taiga, a turnkey software tool that simplifies the use of scientific data.

It can be installed as a command line tool and/or a ReSTful web service.

Source code is available at Open Channel Software. However, please note that Pomegranate alone won’t be enough to establish a w10n-sci service. What you really need is the instruction file service-setup.txt, which details the steps necessary to build, install and configure a complete service. Alternatively, use a turnkey solution like Taiga, so that you can be up and running in minutes.


We investigate performance improvements for the discrete element method (DEM) used in ppohDEM. First, we use OpenMP and MPI to parallelize DEM for efficient operation on many types of memory, including shared memory, and at any scale, from small PC clusters to supercomputers. We also describe a new algorithm for the descending storage method (DSM) based on a sort technique that makes creation of contact candidate pair lists more efficient. Finally, we measure the performance of ppohDEM using the proposed improvements, and confirm that computational time is significantly reduced. We also show that the parallel performance of ppohDEM can be improved by reducing the number of OpenMP threads per MPI process.


Precimonious employs a dynamic program analysis technique to find a lower floating-point precision that can be used in any part of a program. Precimonious performs a search on the program variables trying to lower their precision subject to accuracy constraints and performance goals. The tool then recommends a type instantiation for these variables using less precision while producing an accurate enough answer without causing exceptions.
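The idea of searching for a lower-precision type assignment under an accuracy constraint can be sketched as follows. This is not Precimonious: the tool works on compiled programs and uses a more elaborate search, whereas this sketch uses a simple greedy pass. The kernel, the variable names "term" and "acc", and the tolerance are all hypothetical, and single precision is emulated by rounding through the standard struct module.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


ROUND = {"f64": float, "f32": to_f32}


def kernel(precision):
    """Add 0.1 ten thousand times, with each variable stored at its assigned precision."""
    term = ROUND[precision["term"]](0.1)
    acc_round = ROUND[precision["acc"]]
    acc = 0.0
    for _ in range(10000):
        acc = acc_round(acc + term)
    return acc


def tune(variables, kernel, tol):
    """Greedy search: lower one variable at a time and keep the change only
    if the relative error against the all-double run stays within tol."""
    config = {v: "f64" for v in variables}
    reference = kernel(config)
    for v in variables:
        trial = dict(config, **{v: "f32"})
        if abs(kernel(trial) - reference) <= tol * abs(reference):
            config = trial
    return config


best = tune(["term", "acc"], kernel, tol=1e-6)
print(best)
```

In this example the constant can safely be stored in single precision, but the accumulator cannot: rounding the running sum to single precision on every addition accumulates an error far above the tolerance.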


PREESM is an open source rapid prototyping tool. It simulates signal processing applications and generates code for heterogeneous multi/many-core embedded systems. Its dataflow language eases the description of parallel signal processing applications.

The PREESM tool inputs are an algorithm graph, an architecture graph, and a scenario, which is a set of parameters and constraints that specify the conditions under which the deployment will run. The chosen type of algorithm graph is a parameterized and hierarchical extension of Synchronous Dataflow (SDF) graphs named PiSDF. The architecture graph is named System-Level Architecture Model (S-LAM). From these inputs, PREESM automatically maps and schedules the code over the multiple processing elements and generates multi-core code.

PREESM is an Eclipse plug-in.


Describe your software project just once, using Premake’s simple and easy to read syntax, and build it everywhere.

Generate project files for Visual Studio, GNU Make, Xcode, Code::Blocks, and more across Windows, Mac OS X, and Linux. Use the full featured Lua scripting engine to make build configuration tasks a breeze.


PRIMME is a C library to find a number of eigenvalues and their corresponding eigenvectors of a Real Symmetric, or Complex Hermitian matrix A. Symmetric and Hermitian eigenvalue problems enjoy a remarkable theoretical structure that allows for efficient and stable algorithms for obtaining a few required eigenpairs. This is probably one of the reasons that enabled applications requiring the solution of symmetric eigenproblems to push their accuracy and thus computational demands to unprecedented levels. Materials science, structural engineering, and some QCD applications routinely compute eigenvalues of matrices of dimension more than a million; and often much more than that! Typically, with increasing dimension comes increased ill conditioning, and thus the use of preconditioning becomes essential.
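For readers unfamiliar with the problem, the following sketch computes a single eigenpair of a small real symmetric matrix by power iteration. Power iteration is a much simpler technique swapped in for illustration; it is not the method PRIMME uses and has no preconditioning. The function names and the test matrix are hypothetical.

```python
import math


def matvec(a, v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def power_iteration(a, v, iters=100):
    """Repeatedly apply A and normalise; v converges to the eigenvector of the
    eigenvalue with largest magnitude, provided the start vector is not
    orthogonal to it."""
    for _ in range(iters):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v (v has unit length) estimates the eigenvalue.
    lam = sum(x * y for x, y in zip(v, matvec(a, v)))
    return lam, v


A = [[2.0, 1.0], [1.0, 2.0]]  # symmetric, with eigenvalues 3 and 1
lam, vec = power_iteration(A, [1.0, 0.0])
print(lam, vec)
```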


A programming language, development environment, and online community. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. Initially created to serve as a software sketchbook and to teach computer programming fundamentals within a visual context, Processing evolved into a development tool for professionals.

Processing continues to be an alternative to proprietary software tools with restrictive and expensive licenses, making it accessible to schools and individual students. Its open source status encourages the community participation and collaboration that is vital to Processing’s growth. Contributors share programs, contribute code, and build libraries, tools, and modes to extend the possibilities of the software. The Processing community has written more than a hundred libraries to facilitate computer vision, data visualization, music composition, networking, 3D file exporting, and programming electronics.


A collection of classes that perform the heavy lifting for you, so that you only need to write a minimal amount of code. This library is compatible with both Processing and Processing.js.


Processing.js is the sister project of the popular Processing visual programming language, designed for the web. Processing.js makes your data visualizations, digital art, interactive animations, educational graphs, video games, etc. work using web standards and without any plug-ins. You write code using the Processing language, include it in your web page, and Processing.js does the rest.


This project provides a Python package that creates an environment for graphics applications that closely resembles that of the Processing system. The project mission is to implement Processing’s friendly graphics functions and interaction model in Python. Not all of Processing is to be ported, though, since Python itself already provides alternatives for many features of Processing, such as XML parsing. The pyprocessing backend is built upon OpenGL and Pyglet, which provide the actual graphics rendering. Since these are multiplatform, so is pyprocessing.


An independent, open source library collection for computational design tasks with Java & Processing. The classes are purposefully kept fairly generic in order to maximize re-use in different contexts ranging from generative design, animation, interaction/interface design, data visualization to architecture and digital fabrication, use as teaching tool and more.

Programming Languages

Languages of special interest.


A little language for machines with Speech Acts inspired by Elephant 2000. The parser uses the wonderful Clojure Instaparse library. The language aims to have syntactically sugared "speech acts" that the machine uses as inputs and outputs. The language also supports beliefs and goals from McCarthy’s paper, Ascribing Mental Qualities to Machines.


Eon is the first energy-aware programming language. It is a declarative coordination language and runtime system designed to simplify the development of perpetual systems by separating program logic from energy management. Using Eon, the system designer describes program operation as well as how the program can be adjusted in order to conserve energy. During operation the Eon runtime system automatically adjusts the program in order to sustain operation based on online measurements of energy harvest and per-task energy consumption.


Esterel is a programming language dedicated to control-dominated reactive systems, such as control circuits, embedded systems, human-machine interface, or communication protocols.


Halide is a new programming language designed to make it easier to write high-performance image processing code on modern machines. Its current front end is embedded in C++. Compiler targets include x86/SSE, ARM v7/NEON, CUDA, Native Client, and OpenCL.


HANSEI is the embedded domain-specific language for probabilistic programming: for writing potentially infinite discrete-distribution models and performing exact inference, importance sampling and inference of inference.

HANSEI is an ordinary OCaml library, with probability distributions represented as ordinary OCaml programs. Delimited continuations let us reify non-deterministic programs as lazy search trees, which we may then traverse, explore, or sample. Thus an inference procedure and a model invoke each other as co-routines. Thanks to the delimited control, deterministic expressions look exactly like ordinary OCaml expressions, and are evaluated as such, without any overhead.
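Exact inference on a small discrete model can be illustrated by brute-force enumeration. The sketch below is Python, not OCaml, and it does not use delimited continuations or lazy search trees as HANSEI does; it simply walks every outcome of a hypothetical two-coin model. The function exact_posterior and the model itself are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def exact_posterior(query, evidence):
    """P(query | evidence) for two independent fair coins, computed by
    enumerating all four outcomes with exact rational weights."""
    coin = [(True, Fraction(1, 2)), (False, Fraction(1, 2))]
    num = Fraction(0)
    den = Fraction(0)
    for (a, pa), (b, pb) in product(coin, coin):
        weight = pa * pb
        if evidence(a, b):
            den += weight
            if query(a, b):
                num += weight
    return num / den


# Probability that both coins are heads, given that at least one is.
p = exact_posterior(lambda a, b: a and b, lambda a, b: a or b)
print(p)
```

Three of the four outcomes satisfy the evidence and one of them satisfies the query, so the result is exactly 1/3.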


The Heterogeneous Image Processing Acceleration Framework allows the design of image processing kernels and algorithms in a domain-specific language (DSL). From this high-level description, low-level target code for GPU accelerators is generated using source-to-source translation. As back ends, the framework supports CUDA, OpenCL, and Renderscript.


Jolie is an open-source programming language for developing distributed applications based on microservices. In the programming paradigm proposed with Jolie, each program is a service that can communicate with other programs by sending and receiving messages over a network. Jolie supports an abstraction layer that allows services to communicate over different mediums, ranging from TCP/IP sockets to local in-memory communications between processes. Jolie is currently supported by an interpreter implemented in the Java language, which can be run on multiple operating systems. Since it supports the orchestration of Web Services, Jolie is an alternative to XML-based orchestration languages such as WS-BPEL, as it offers a concise (C-like) syntax for accessing XML-like data structures.


Language and compiler that turn image processing graphs (specific language + C) into a single merged OpenCL kernel tuned for the target many-core architecture.


The Mozart Programming System combines ongoing research in programming language design and implementation, constraint logic programming, distributed computing, and human-computer interfaces. Mozart implements the Oz language and provides both expressive power and advanced functionality. Mozart excels in creating distributed, concurrent applications, because it makes a network fully transparent. It supports GUI applications through Tcl/Tk integration, because it runs applications in a virtual machine: applications can be developed once and run on many different platforms.

The PLDC Research Group at UCL is proud to announce the first release of Mozart 2. This release contains a completely redesigned 64-bit virtual machine (compatible with 32-bit and 64-bit processors), and adds an extension interface to the virtual machine to allow language extensions defined within Oz. The PLDC Research Group will use Mozart 2 for future programming education and future research in programming language design and implementation.

The first release of Mozart 2 does not provide support for constraints or distributed programming. We plan for successive releases to support constraint programming with an interface to the Gecode system, and to support distributed programming with a peer-to-peer transactional storage and extensions for network-transparent and synchronization-free programming.


Oberon is a general-purpose programming language created in 1986 by Professor Niklaus Wirth and the latest member of the Wirthian family of ALGOL-like languages (Euler, Algol-W, Pascal, Modula, and Modula-2). Oberon was the result of a concentrated effort to increase the power of Modula-2, the direct successor of Pascal, and simultaneously to reduce its complexity. Its principal new feature is the concept of type extension of record types: it permits constructing new data types on the basis of existing ones and relating them, deviating from the dogma of strictly static data typing.

The new System since 2008 is now called A2. A2 is the name of a modern integrated software environment. It is a single-user, multi-tasking system that runs on bare hardware or on top of a host operating system.


Pharo is an open source implementation of the programming language and environment Smalltalk. Pharo offers strong live programming features such as immediate object manipulation, live update, and hot recompilation. The live programming environment is at the heart of the system. Pharo also supports advanced web development with frameworks such as Seaside and, more recently, Tide.


Seaside provides a layered set of abstractions over HTTP and HTML that let you build highly interactive web applications quickly, reusably and maintainably. It is based on Smalltalk.


Twelf is a language used to specify, implement, and prove properties of deductive systems such as programming languages and logics. Twelf is a piece of computer software, and it is also a computer language understood by the Twelf software. C code and Java code describe programs, HTML code describes graphical web pages, and Twelf code describes logical systems.

The reason someone might want to use Twelf code to describe a logical system is that once they’ve described it, they can write more Twelf code that uses that logical system. You could use Twelf to write out a statement about basic arithmetic (for instance, “if a + b = c, then b + a = c”), and then use Twelf to write out a justification of why that statement is true (i.e. a proof). When you do so, Twelf will check your proof, making sure that what you said actually is true!
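The arithmetic statement used as an example above can be written and machine-checked in any proof assistant. The following is a sketch in Lean 4 rather than Twelf, given only to show what a statement together with its checked justification looks like; the theorem name add_swap is hypothetical.

```lean
-- If a + b = c, then b + a = c.
theorem add_swap (a b c : Nat) (h : a + b = c) : b + a = c := by
  rw [Nat.add_comm b a]  -- rewrite the goal b + a = c into a + b = c
  exact h                -- which is exactly the hypothesis
```

If the justification were wrong, for instance by claiming the goal without the rewrite step, the checker would reject it.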

It turns out that while basic arithmetic, set theory, and interesting logics are logical systems, programming languages are also logical systems - and Twelf has a couple of unique features that make it a great tool to use when the logical systems you are working with are programming languages.


We describe an implementation to solve Poisson's equation for an isolated system on a unigrid mesh using FFTs. The method solves the equation globally on mesh blocks distributed across multiple processes on a distributed-memory parallel computer. Test results to demonstrate the convergence and scaling properties of the implementation are presented. The solver is offered to interested users as the library PSPFFT.


The Parallel Ultra-Light Systolic Array Runtime (PULSAR), now in version 2.0, is a complete programming platform for large-scale distributed memory systems with multicore processors and hardware accelerators. PULSAR provides a simple abstraction layer over multithreading, message-passing, and multi-GPU, multi-stream programming. PULSAR offers a general-purpose programming model, suitable for a wide range of scientific and engineering applications.

This simple programming model allows the user to define the computation in the form of a Virtual Systolic Array (VSA), which is a set of Virtual Data Processors (VDPs) connected by data channels. This programming model is accessible to the user through a very small and simple Application Programming Interface (API), and all the complexity of executing the workload on a large-scale system is hidden in the runtime implementation.
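The shape of such a model can be sketched in a few lines: processors that fire when data is available on their input channel and pass results downstream. This is a sequential toy, not the PULSAR API; the class name VDP, its fire method, and the run_array scheduler are hypothetical, and the real runtime executes VDPs across threads, nodes and GPUs.

```python
from collections import deque


class VDP:
    """A virtual data processor: reads one packet from its input channel,
    applies its function, and writes the result to its output channel."""

    def __init__(self, fn, inbox, outbox):
        self.fn = fn
        self.inbox = inbox
        self.outbox = outbox

    def fire(self):
        if not self.inbox:
            return False  # nothing to do until data arrives
        self.outbox.append(self.fn(self.inbox.popleft()))
        return True


def run_array(vdps):
    """Keep firing VDPs until none of them has input left."""
    progress = True
    while progress:
        progress = False
        for vdp in vdps:
            if vdp.fire():
                progress = True


# Four channels connect three VDPs in a one-dimensional array.
channels = [deque([1, 2, 3]), deque(), deque(), deque()]
array = [
    VDP(lambda x: x + 1, channels[0], channels[1]),
    VDP(lambda x: x * 2, channels[1], channels[2]),
    VDP(lambda x: x - 3, channels[2], channels[3]),
]
run_array(array)
print(list(channels[3]))
```

The user only describes the processors and the channels between them; the order in which the processors actually fire is left to the scheduler.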

The runtime supports distributed memory systems with multicore processors and relies on POSIX Threads (a.k.a. Pthreads) for intra-node multithreading, and on the Message Passing Interface (MPI) for inter-node communication. The runtime also supports multiple Nvidia GPU accelerators, in each distributed memory node, using the Compute Unified Device Architecture (CUDA) platform.

Pure Data

Pure Data (aka Pd) is an open source visual programming language. Pd enables musicians, visual artists, performers, researchers, and developers to create software graphically, without writing lines of code. Pd is used to process and generate sound, video, 2D/3D graphics, and interface sensors, input devices, and MIDI. Pd can easily work over local and remote networks to integrate wearable technology, motor systems, lighting rigs, and other equipment. Pd is suitable for learning basic multimedia processing and visual programming methods as well as for realizing complex systems for large-scale projects.

Pd is a so-called data flow programming language, in which programs called patches are developed graphically. Algorithmic functions are represented by objects, placed on a screen called the canvas. Objects are connected together with cords, and data flows from one object to another through these cords. Each object performs a specific task, from very low level mathematical operations to complex audio or video functions such as reverberation, FFT, or video decoding.


Linux-centric monolithic distribution based on pd-extended with focus on solid/stable core, enhancements, and usability features including infinite undo, gui-based iemgui object editing, accelerated visual editor and gui operations, improved appearance, K12 education mode, and more. The distribution is developed for and maintained by Virginia Tech’s Linux Laptop Orchestra (L2Ork).


Pushpin is a new way to build realtime HTTP and WebSocket services.


A collection of standard atmospheric and oceanic sciences routines.


The Python ARM Radar Toolkit, Py-ART, is an open source Python module containing a growing collection of weather radar algorithms and utilities built on top of the Scientific Python stack and distributed under the 3-Clause BSD license. Py-ART is used by the Atmospheric Radiation Measurement (ARM) Climate Research Facility for working with data from a number of precipitation and cloud radars, but has been designed so that it can be used by others in the radar and atmospheric communities to examine, process, and analyse data from many types of weather radars.


A library which follows the Python/C API as closely as possible, while providing equivalent functionality for Objective Caml. It is built against Python 2.x and OCaml 3.04.

It is intended to allow users to build native OCaml libraries and use them from Python and, conversely, to allow OCaml users to benefit from linkable libraries provided for Python.


pyDatalog adds the logic programming paradigm to Python’s extensive toolbox, in a pythonic way.

Logic programmers can now use the extensive standard library of Python, and Python programmers can now express complex algorithms quickly.

Datalog is a truly declarative language derived from Prolog, with strong academic foundations. Datalog excels at managing complexity. Datalog programs are shorter than their Python equivalents, and Datalog statements can be specified in any order, as simply as formulas in a spreadsheet.
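
To make the declarative style concrete, here is a minimal sketch of naive bottom-up evaluation for two Datalog-style rules, written in plain Python. It does not use pyDatalog or its syntax; the function name `ancestors` and the sample facts are made up for illustration.

```python
def ancestors(parent_facts):
    """Naive bottom-up evaluation of two Datalog-style rules:

        ancestor(X, Y) <- parent(X, Y)
        ancestor(X, Y) <- parent(X, Z), ancestor(Z, Y)
    """
    ancestor = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        # Second rule: join parent(X, Z) with ancestor(Z, Y).
        derived = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in ancestor
            if z == z2
        }
        if derived <= ancestor:  # fixpoint reached, nothing new to add
            return ancestor
        ancestor |= derived


# Hypothetical facts: each pair reads "first is a parent of second".
facts = {("bill", "john"), ("john", "sam"), ("sam", "ann")}
closure = ancestors(facts)
```

Because the answer is the fixpoint of the rules, it does not depend on the order in which the rules or facts are written, which is the property the description above refers to.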


PyDom is a Python package which implements various diagnostics for NEMO model output.


PyFR is an open-source Python based framework for solving advection-diffusion type problems on streaming architectures using the Flux Reconstruction approach of Huynh. The framework is designed to solve a range of governing systems on mixed unstructured grids containing various element types. It is also designed to target a range of hardware platforms via use of an in-built domain specific language derived from the Mako templating engine.


A cross-platform windowing and multimedia library for Python. Pyglet provides an object-oriented programming interface for developing games and other visually-rich applications for Windows, Mac OS X and Linux. Features include:

  • No external dependencies or installation requirements. For most application and game requirements, pyglet needs nothing else besides Python, simplifying distribution and installation.

  • Take advantage of multiple windows and multi-monitor desktops. pyglet allows you to use as many windows as you need, and is fully aware of multi-monitor setups for use with fullscreen games.

  • Load images, sound, music and video in almost any format. pyglet can optionally use AVbin to play back audio formats such as MP3, OGG/Vorbis and WMA, and video formats such as DivX, MPEG-2, H.264, WMV and Xvid.


PyGeoIf provides a GeoJSON-like protocol for geo-spatial (GIS) vector data. When you want to write your own geospatial library with support for this protocol, you may use pygeoif as a starting point and build your functionality on top of it.

You may think of pygeoif as a shapely ultralight which lets you construct geometries and perform very basic operations like reading and writing geometries from/to WKT, constructing line strings out of points, polygons from linear rings, multi polygons from polygons, etc. It was inspired by shapely and implements the geometries in a way that when you are familiar with shapely you feel right at home with pygeoif. It was written to provide clean and python only geometries for fastkml.
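
The kind of object described above can be sketched in a few lines of plain Python. This is not pygeoif's code: the classes below are hypothetical stand-ins that expose a GeoJSON-like mapping through the `__geo_interface__` attribute convention and write themselves out as WKT.

```python
class Point:
    """A 2D point exposing a GeoJSON-like mapping and a WKT string."""

    def __init__(self, x, y):
        self.x, self.y = float(x), float(y)

    @property
    def __geo_interface__(self):
        return {"type": "Point", "coordinates": (self.x, self.y)}

    @property
    def wkt(self):
        return f"POINT ({self.x:g} {self.y:g})"


class LineString:
    """A line string constructed out of points."""

    def __init__(self, points):
        self.points = list(points)

    @property
    def __geo_interface__(self):
        return {
            "type": "LineString",
            "coordinates": tuple((p.x, p.y) for p in self.points),
        }

    @property
    def wkt(self):
        body = ", ".join(f"{p.x:g} {p.y:g}" for p in self.points)
        return f"LINESTRING ({body})"


line = LineString([Point(0, 0), Point(1, 1.5)])
```

Any library that understands the same mapping can consume `line.__geo_interface__` without knowing which package created the geometry.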


Utilities for applying scikit-learn to spatial datasets.


The Kepler archive contains time-series data that have been calibrated and reduced from detector pixels. This pipelined reduction includes the removal of time-series trends systematic to the spacecraft and its environment rather than the targets. For every target there is a level of subjectivity required to reduce systematics. Differing scientific goals are likely to have differing requirements for systematic mitigation. Systematic reduction in the Kepler pipeline is optimized to yield the highest number of potentially-detectable exoplanet transits from a sample of 200,000 stars. PyKE, on the other hand, is a group of python tasks developed for the reduction and analysis of Kepler pixel-level data and Simple Aperture Photometry (SAP) data of individual targets with individual characteristics. PyKE was developed to provide alternative data reduction, tunable to the user’s specific science goals. The main purposes of these tasks are to i) re-extract light curves from manually-chosen pixel apertures and ii) cotrend and/or detrend the data in order to reduce or remove systematic noise structure using methods tunable to user and target-specific requirements. Tasks to perform data analysis developed for the author’s science programs are also included. PyKE is an open source project. Contributions of new tasks or enhanced functionality of existing tasks by the community are welcome.

PyKE is a python-based PyRAF package which can also be executed without PyRAF on the command line of a shell.


pyKML is a Python package for creating, parsing, manipulating, and validating KML, a language for encoding and annotating geographic data.

pyKML is based on the lxml.objectify API which provides a Pythonic API for working with XML documents. pyKML adds additional functionality specific to the KML language.

KML comes in several flavors. pyKML can be used with KML documents that follow the base OGC KML specification, the Google Extensions Namespace, or a user-supplied extension to the base KML specification (defined by an XML Schema document).
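
For readers who have not seen KML, the following sketch builds and re-parses a minimal document in the base OGC KML 2.2 namespace. It uses only the standard library's `xml.etree.ElementTree`, not pyKML or lxml, and the helper name `placemark` and the sample coordinates are invented for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # base OGC KML namespace
ET.register_namespace("", KML_NS)


def placemark(name, lon, lat):
    """Build a minimal KML document containing one point placemark."""
    kml = ET.Element(f"{{{KML_NS}}}kml")
    pm = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML coordinates are written as longitude,latitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return kml


doc = placemark("Example", -77.04, 38.89)
text = ET.tostring(doc, encoding="unicode")

# Parse the serialized document back and read the placemark name.
parsed = ET.fromstring(text)
name = parsed.find(f"{{{KML_NS}}}Placemark/{{{KML_NS}}}name").text
```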


A machine learning research library based on Theano.


Pynamic is a benchmark designed to test a system’s ability to handle the Dynamic Linking and Loading requirements of Python-based scientific applications. We developed this benchmark to represent a newly emerging class of DLL behaviors. Pynamic builds on pyMPI, an MPI extension to Python. Our augmentation includes a code generator that automatically generates Python C-extension dummy codes and a glue layer that facilitates linking and loading of the generated dynamic modules into the resulting pyMPI. Pynamic is configurable, enabling it to model the static properties of a specific code. It does not, however, model any significant computations of the target and hence it is not subjected to the same level of control as the target code. In fact, we encourage HPC computer vendors and tool developers to add it to their test suites. This benchmark provides an effective test of the compiler, the linker, the loader, the OS kernel and other runtime systems of a high performance computing (HPC) system to handle an important aspect of modern scientific computing applications. In addition, the benchmark serves as a stress test case for code development tools. Although Python has recently gained popularity in the HPC community, its heavy use of DLL operations has hindered certain HPC code development tools, notably parallel debuggers, from performing optimally.

The heart of Pynamic is a Python script that generates C files and compiles them into shared object libraries. Each library contains a Python callable entry function as well as a number of utility functions. The user can also enable cross library function calls with a command line argument. The Pynamic configure script then links these libraries into the pynamic-pyMPI executable and creates a driver script to exercise the functions in the generated libraries. The user can specify the number of libraries to create, as well as the average number of utility functions per library, thus tailoring the benchmark to match some application of interest. Pynamic introduces randomness in the number of functions per module and the function signatures, thus ensuring some heterogeneity of the libraries and functions.


A Python remote procedure call framework that uses JSON RPC v2.0. Python-JRPC allows programmers to create powerful client/server programs with very little code.
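
The wire format is simple enough to show directly. The sketch below builds and answers a JSON-RPC 2.0 request with the standard `json` module; it is not the Python-JRPC API, and the helper names `make_request` and `handle` are hypothetical. Only positional parameters are handled.

```python
import json


def make_request(method, params, request_id):
    """Serialize a JSON-RPC 2.0 request object."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def handle(raw, methods):
    """Dispatch one request to a dict of callables and build the response."""
    request = json.loads(raw)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    func = methods.get(request.get("method"))
    if func is None:
        # -32601 is the code the specification reserves for an unknown method.
        response["error"] = {"code": -32601, "message": "Method not found"}
    else:
        response["result"] = func(*request.get("params", []))
    return json.dumps(response)


methods = {"add": lambda a, b: a + b}
reply = json.loads(handle(make_request("add", [2, 3], 1), methods))
missing = json.loads(handle(make_request("nope", [], 2), methods))
```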

the FastHCS algorithm, we carry out an extensive simulation study and four real data applications, the results of which show that FastHCS is systematically more robust to outliers than its competitors.


A Python to C++ compiler for a subset of the Python language. It takes a python module annotated with a few interface description and turns it into a native python module with the same interface, but (hopefully) faster. It is meant to efficiently compile scientific programs, and takes advantage of multi-cores and SIMD instruction units.


A cross-platform free and open-source desktop geographic information system (GIS) application that provides data viewing, editing, and analysis capabilities. Similar to other software GIS systems QGIS allows users to create maps with many layers using different map projections. Maps can be assembled in different formats and for different uses. QGIS allows maps to be composed of raster or vector layers. Typical for this kind of software the vector data is stored as either point, line, or polygon-feature. Different kinds of raster images are supported and the software can perform georeferencing of images.

QGIS provides integration with other open source GIS packages, including PostGIS, GRASS, and MapServer to give users extensive functionality. Plugins, written in Python or C++, extend the capabilities of QGIS. There are plugins to geocode using the Google Geocoding API, perform geoprocessing (fTools) similar to the standard tools found in ArcGIS, and interface with PostgreSQL/PostGIS, SpatiaLite and MySQL databases.


Python bindings for QGIS that depend on SIP and PyQt4.

R Language


R package implementing multitaper spectral estimation techniques used in time series analysis. This version may be slightly more updated than the one on CRAN.


This package provides a framework to perform Non-negative Matrix Factorization (NMF). It implements a set of already published algorithms and seeding methods, and provides a framework to test, develop and plug new/custom algorithms. Most of the built-in algorithms have been optimized in C++, and the main interface function provides an easy way of performing parallel computations on multicore machines.


Supports the analysis of Oceanographic data, including ADP measurements, CTD measurements, sectional data, sea-level time series, coastline files, etc. Provides functions for calculating seawater properties such as potential temperature and density, as well as derived properties such as buoyancy frequency and dynamic height.


Functions for transforming and viewing 2-D and 3-D (oceanographic) data and model output.


OpenCPU is a system for embedded scientific computing and reproducible research. The OpenCPU server provides a reliable and interoperable HTTP API for data analysis based on R. You can either use the public servers or host your own. The OpenCPU JavaScript client library provides the most seamless integration of R and JavaScript available today. Enjoy simple RPC and data I/O through standard Ajax techniques. No need to learn crazy widgets or obscure frameworks. The OpenCPU API is a clean and simple interface to R, nothing more, nothing less. It is compatible with any language or framework that speaks HTTP.


A suite of functions for converting sp-class objects into KML or KMZ documents for use in Google Earth. Visualization of spatial and spatio-temporal objects in Google Earth.


A multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package includes also some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.


This package builds on the EMD package to provide additional tools for empirical mode decomposition (EMD) and Hilbert spectral analysis. It also implements the ensemble empirical mode decomposition (EEMD) and the complete ensemble empirical mode decomposition (CEEMD) methods to avoid mode mixing and intermittency problems found in EMD analysis. The package comes with several plotting methods that can be used to view intrinsic mode functions, the HHT spectrum, and the Fourier spectrum. To see the version history and download the bleeding-edge version (at your own risk!), see the project website. See the other links for PDF files describing numerical and exact analytical methods for determining instantaneous frequency, some examples of signals processed with this package, and some examples of the ensemble empirical mode decomposition method.


An R interface for C library libeemd for performing the ensemble empirical mode decomposition (EEMD), its complete variant (CEEMDAN) or the regular empirical mode decomposition (EMD).


Rserve is a TCP/IP server which allows other programs to use facilities of R from various languages without the need to initialize R or link against the R library. Every connection has a separate workspace and working directory. Client-side implementations are available for popular languages such as C/C++, PHP and Java. Rserve supports remote connection, authentication and file transfer. Typical use is to integrate an R backend for computation of statistical models, plots etc. in other applications.


Shiny makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive pre-built widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.


Discrete Prolate Spheroidal Sequence (Slepian) Regression Smoothers.


Package for discrete Morse-Smale complex approximation based on kNN graph. The Morse-Smale complex provides a decomposition of the domain. This package provides methods to compute a hierarchical sequence of Morse-Smale complexes and tools that exploit this domain decomposition for regression and visualization of scalar functions.


This package provides regularized principal component analysis incorporating smoothness, sparseness and orthogonality of eigenfunctions by using alternating direction method of multipliers (ADMM) algorithm.


This package contains a set of measures of dissimilarity between time series to perform time series clustering. Metrics based on raw data, on generating models and on the forecast behavior are implemented. Some additional utilities related to time series clustering are also provided, such as clustering algorithms and cluster evaluation metrics.


The W2CWM2C package is a set of functions to produce new graphical tools for wavelet correlation (bivariate and multivariate cases) using some routines from the waveslim and wavemulcor packages.


Wavelet analysis and reconstruction of time series, cross-wavelets and phase-difference (with filtering options), significance with simulation algorithms.


Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002).


Methods to calculate and interpret climate change signals and time series from climate multi-model ensembles. Climate model output in binary NetCDF format is read in and aggregated over a specified region to a data.frame for statistical analysis. Global circulation models (GCMs), as the CMIP5 or CMIP3 simulations, can be read in the same way as Regional Climate Models (RCMs), as e.g. the CORDEX or ENSEMBLES simulations.


Simulation and Inference for Stochastic Differential Equations. The YUIMA Project is an open source and collaborative effort aimed at developing the R package yuima for simulation and inference of stochastic differential equations. In the yuima package stochastic differential equations can be of very abstract type, multidimensional, driven by Wiener process or fractional Brownian motion with general Hurst parameter, with or without jumps specified as Lévy noise. The yuima package is intended to offer the basic infrastructure on which complex models and inference procedures can be built on. This paper explains the design of the yuima package and provides some examples of applications.


The Rapid Python Deep Learning Infrastructure (RaPyDLI) project is based on the objective to combine high level Python, C/C++ and Java environments with carefully designed libraries supporting GPU accelerators and MIC coprocessors. Interactive analysis and visualization will be supported together with scaling from the current terabyte size to Petabyte datasets to enable substantial progress in the complexity and capability of the DL applications. A broad range of storage models will be supported including network file systems, databases and HDFS. The partnership of Indiana University, University of Tennessee Knoxville, and Stanford University combines leaders in parallel computing algorithms and run times, Big Data, clouds, and deep learning.


Array Databases allow storing and querying massive multi-dimensional arrays, such as sensor, image, simulation, and statistics data appearing in domains like earth, space, and life science.

The rasdaman ("raster data manager") is the leading array analytics engine distinguished by its flexibility, performance, and scalability. Rasdaman embeds itself smoothly into PostgreSQL, but can also run standalone on file systems. In fact, rasdaman has pioneered Array Databases being the first fully implemented, operationally used system with an array query language and optimized processing engine; known rasdaman databases exceed 230 TB.


The petascope component of rasdaman implements the OGC interface standards WCS 2.0, WCS-T 1.4, WCPS 1.0, WPS 1.0, and WMS 1.1. For this purpose, petascope maintains its additional metadata (such as georeferencing) which is kept in separate relational tables. Note that not all rasdaman raster objects and collections are available through petascope by default; rather, they need to be registered through the petascope administration interface.

Petascope is implemented as a WAR file of servlets which give access to coverages (in the OGC sense) stored in rasdaman. Internally, incoming requests requiring coverage evaluation are translated into rasql queries by petascope. These queries are passed on to rasdaman, which constitutes the central workhorse. Finally, results returned from rasdaman are forwarded to the client.


Clean and fast and geospatial raster I/O for Python programmers who use Numpy. Rasterio employs GDAL under the hood for file I/O and raster formatting. Its functions typically accept and return Numpy ndarrays. Rasterio is designed to make working with geospatial raster data more productive and more fun.

Raster Numpy Basics

IPython notebook tutorial on nbviewer.


A MATLAB implementation of the RBF-QR method for radial basis function interpolation in the small shape parameter range.


Recki-CT is a set of tools that implement a PHP compiler, in PHP. It doesn’t provide a VM, so it can’t run PHP by itself. However, it can parse PHP code and generate other code from it. Recki uses the well-known PHP-Parser library to generate a graph-based representation of the code, and convert it to an intermediate representation. This intermediate form is pretty low-level, and it is comparatively simple to generate code from it for a variety of targets. One of the targets Recki can use is a second component, JitFu, which is a PHP extension allowing us to generate machine code at run time.


JIT-Fu is a PHP extension that exposes an OO API for the creation of native instructions to PHP userland, using libjit.


LibJIT is a library that provides generic Just-In-Time compiler functionality independent of any particular bytecode, language, or runtime. The goal of the libjit project is to provide an extensive set of routines that takes care of the bulk of the JIT process, without tying the programmer down with language specifics. Where we provide support for common object models, we do so strictly in add-on libraries, not as part of the core code.


Redis is an open source, BSD licensed, advanced key-value cache and store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets, sorted sets, bitmaps and hyperloglogs.


RLaB is an interactive, interpreted scientific programming environment for fast numerical prototyping and program development. rlabplus provides the third release of the environment for 32- and 64-bit Linux systems on Intel and ARM/RaspberryPi architectures. The environment integrates a large number of numerical solvers and functions from various sources, most notably from the GNU Scientific Library (GSL) and from netlib. Within the environment it is possible to visualize data using gnuplot, xmgrace, and pgplot xor plplot; get and post data using uniform resource locator implementing HDF5 or world wide web; and control serial, GPIB or TCP/IP connection. RLaB supports embedded python, java and ngspice interpreters. RLaB was created by Ian Searle and collaborators. rlabplus is being actively developed by Marijan Kostrun.


A vision of heterogeneous computer systems that incorporate diverse accelerators and automatically select the best computational unit for a particular task is widely shared among researchers and many industry analysts; however, there are no agreed-upon benchmarks to support the research needed in the development of such a platform. There are many suites for parallel computing on general-purpose CPU architectures, but accelerators fall into a gap that is not covered by previous benchmark development. Rodinia is released to address this concern.


The Robot Operating System (ROS) is a flexible framework for writing robot software. It is a collection of tools, libraries, and conventions that aim to simplify the task of creating complex and robust robot behavior across a wide variety of robotic platforms.


This professional scientific software computes recurrence plots, cross recurrence plots, joint recurrence plots and recurrence quantification analysis on the command line of Unix and DOS/DOS-emulated systems. It is able to work with really long data series. However, the output of the results (plots) has to be prepared with external programmes (e.g. gnuplot or Matlab).

The state space trajectory can be reconstructed from single time-series by time-delay embedding. Alternatively, the columns of input data can be used as the components of the state space vectors.
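
A minimal sketch of the two steps, time-delay embedding followed by a thresholded recurrence matrix, is given below in plain Python. It is not the command-line program's implementation; the function names, the Euclidean norm, and the parameters `dim`, `tau`, and `eps` are choices made for this illustration (`math.dist` needs Python 3.8 or later).

```python
import math


def delay_embed(series, dim, tau):
    """Reconstruct state vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n)]


def recurrence_matrix(vectors, eps):
    """R[i][j] = 1 when states i and j are closer than eps, else 0."""
    return [
        [1 if math.dist(a, b) <= eps else 0 for b in vectors]
        for a in vectors
    ]


# A period-4 signal: states recur every four samples.
series = [0, 1, 0, -1, 0, 1, 0, -1]
vectors = delay_embed(series, dim=2, tau=1)
rp = recurrence_matrix(vectors, eps=0.1)
```

For a periodic signal like this one, the matrix shows diagonal lines offset by the period, which is the structure that recurrence quantification analysis measures.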


RPerl is an upgrade to the popular Perl 5 programming language. RPerl gives software developers a compiler to make their apps run really fast on parallel computing platforms like multi-core processors, the cloud, clusters, and supercomputers. RPerl stands for Restricted Perl, in that we restrict our use of Perl to those parts which can be made to run fast.

The input to the RPerl compiler is low-magic Perl 5 source code. RPerl converts the low-magic Perl 5 source code into C++ source code using Perl and/or C++ data structures. Inline::CPP converts the C++ source code into XS source code. Perl’s XS tools and a standard C++ compiler convert the XS source code into machine-readable binary code, which can be directly linked back into normal high-magic Perl 5 source code.

The output of the RPerl compiler is fast-running binary code that is exactly equivalent to, and compatible with, the original low-magic Perl 5 source code input. The net effect is that RPerl compiles slow low-magic Perl 5 code into fast binary code, which can optionally be mixed back into high-magic Perl apps.


This document describes an implementation in C of a set of randomized algorithms for computing partial Singular Value Decompositions (SVDs). The techniques largely follow the prescriptions in the article "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions," N. Halko, P.G. Martinsson, J. Tropp, SIAM Review, 53(2), 2011, pp. 217-288, but with some modifications to improve performance. The codes implement a number of low rank SVD computing routines for three different sets of hardware: (1) single core CPU, (2) multi core CPU, and (3) massively multicore GPU.
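
For orientation, the basic scheme from the cited article can be summarised as follows, for an input matrix A of size m by n, a target rank k, and a small oversampling parameter p. This is only the textbook outline; the performance modifications mentioned above are not reflected here.

```latex
\begin{align*}
\Omega &\in \mathbb{R}^{n \times (k+p)} && \text{(Gaussian random test matrix)} \\
Y &= A\,\Omega && \text{(sample the range of } A \in \mathbb{R}^{m \times n}\text{)} \\
Y &= QR && \text{(orthonormal basis } Q \text{ for the sample)} \\
B &= Q^{T} A && \text{(small } (k+p) \times n \text{ matrix)} \\
B &= \tilde{U}\,\Sigma\,V^{T} && \text{(dense SVD of } B\text{)} \\
A &\approx (Q\tilde{U})\,\Sigma\,V^{T} && \text{(approximate partial SVD of } A\text{)}
\end{align*}
```

The expensive operations are the two products with A, which is why the approach maps well onto multicore CPUs and GPUs.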


RTL-SDR is a very cheap software defined radio that uses a DVB-T TV tuner dongle based on the RTL2832U chipset. With the combined efforts of Antti Palosaari, Eric Fry and Osmocom it was found that the signal I/Q data could be accessed directly, which allowed the DVB-T TV tuner to be converted into a wideband software defined radio via a new software driver.

Essentially, this means that a cheap $20 TV tuner USB dongle with the RTL2832U chip can be used as a computer based radio scanner. This sort of scanner capability would have cost hundreds or even thousands just a few years ago. The RTL-SDR is also often referred to as RTL2832U, DVB-T SDR, or the “$20 Software Defined Radio”.


A new cross platform SDR receiver which is based on the liquid-dsp libraries.


OsmoSDR is a small form-factor inexpensive SDR (Software Defined Radio) project. If you are familiar with existing SDR receivers, then OsmoSDR can be thought of as something in between a FunCube Dongle (only 96kHz bandwidth) and a USRP (much more expensive). For a very cheap SDR (with limited dynamic range), you can use the DVB-T USB stick using the RTL2832U chip, as documented in rtl-sdr. OsmoSDR consists of USB-attached hardware, associated firmware, and GrOsmoSDR gnuradio integration on the PC.

The gr-osmosdr software is a GNU Radio block to work with OsmoSDR and rtl-sdr, although it also supports at least a dozen other types of hardware.


A digital signal processing (DSP) library designed specifically for software-defined radios on embedded platforms. The aim is to provide a lightweight DSP library that does not rely on a myriad of external dependencies or proprietary and otherwise cumbersome frameworks. All signal processing elements are designed to be flexible, scalable, and dynamic, including filters, filter design, oscillators, modems, synchronizers, and complex mathematical operations.


Software to turn the RTL2832U into a SDR. Much software is available for the RTL2832. Most of the user-level packages rely on the librtlsdr library which comes as part of the rtl-sdr codebase. This codebase contains both the library itself and also a number of command line tools such as rtl_test, rtl_sdr, rtl_tcp, and rtl_fm. These command line tools use the library to test for the existence of RTL2832 devices and to perform basic data transfer functions to and from the device.

At the user level, there are several options for interacting with the hardware. The rtl-sdr codebase contains a basic FM receiver program that operates from the command line. The rtl_fm program is a command line tool that can initialize the RTL2832, tune to a given frequency, and output the received audio to a file or pipe the output to command line audio players such as the alsa aplay or the sox play commands. There is also the rtl_sdr program that will output the raw I-Q data to a file for more basic analysis.
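
As a starting point for that kind of basic analysis, the sketch below converts raw bytes into complex samples. It assumes the commonly described capture layout of interleaved unsigned 8-bit I and Q values; that layout, the 127.5 centre value, and the function name `iq_from_bytes` are assumptions of this example rather than something stated above.

```python
def iq_from_bytes(raw):
    """Convert interleaved unsigned 8-bit I/Q bytes to complex samples.

    Each byte is shifted and scaled from 0..255 to roughly -1..1.
    A trailing unpaired byte, if any, is ignored.
    """
    usable = len(raw) - len(raw) % 2
    return [
        complex((raw[i] - 127.5) / 127.5, (raw[i + 1] - 127.5) / 127.5)
        for i in range(0, usable, 2)
    ]


# Four bytes give two samples: (I=255, Q=0) and (I=127, Q=128).
samples = iq_from_bytes(bytes([255, 0, 127, 128]))
```

In practice the bytes would come from a capture file opened in binary mode, for example `open(path, "rb").read()`, where `path` is whatever file the capture was written to.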

If you want to do more advanced experiments, the GNU Radio collection of tools can be used to build custom radio devices. GNU Radio can be used both from a GUI perspective in which you can drag-and-drop radio components to build a radio and also programmatically where software programs written in C or Python are created that directly reference the internal GNU Radio functions.

The use of GNU Radio is attractive because of the large number of pre-built functions that can easily be connected together. However, be aware that this is a large body of software with dependencies on many libraries. Thankfully there is a simple script that will perform the installation but still, the time required can be on the order of hours. When starting out, it might be good to try the command line programs that come with the rtl-sdr package first and then install the GNU Radio system later.




Prawn is a nimble PDF writer for Ruby. More important, it’s a hackable platform that offers both high level APIs for the most common needs and low level APIs for bending the document model to accommodate special circumstances.

With Prawn, you can write text, draw lines and shapes and place images anywhere on the page and add as much color as you like. In addition, it brings a fluent API and aggressive code re-use to the printable document space.


Reactive extensions to Python.

The Reactive Extensions for Python (RxPY) is a set of libraries for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators in Python. Using Rx, developers represent asynchronous data streams with Observables, query asynchronous data streams using LINQ operators, and parameterize the concurrency in the asynchronous data streams using Schedulers. Simply put, Rx = Observables + LINQ
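
The "Observables + LINQ" idea can be sketched without the library. The toy `Observable` class below is not the RxPY API; it is a hypothetical, synchronous stand-in with no schedulers, error handling, or completion signals, meant only to show how query operators compose over a pushed stream of values.

```python
class Observable:
    """A toy push-based sequence with chainable query operators."""

    def __init__(self, subscribe):
        self._subscribe = subscribe  # function taking an on_next callback

    @classmethod
    def from_iterable(cls, items):
        def subscribe(on_next):
            for item in items:
                on_next(item)
        return cls(subscribe)

    def map(self, func):
        # Transform each value before passing it downstream.
        return Observable(
            lambda on_next: self._subscribe(lambda x: on_next(func(x)))
        )

    def filter(self, pred):
        # Pass a value downstream only when the predicate holds.
        def subscribe(on_next):
            self._subscribe(lambda x: on_next(x) if pred(x) else None)
        return Observable(subscribe)

    def subscribe(self, on_next):
        self._subscribe(on_next)


received = []
(
    Observable.from_iterable(range(6))
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x * 10)
    .subscribe(received.append)
)
```

Nothing runs until `subscribe` is called; the operators only describe the pipeline, which is what lets a real implementation decide where and when the work happens.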


SageManifolds is a package under development for the modern computer algebra system Sage, implementing differential geometry and tensor calculus.

SageManifolds deals with real differentiable manifolds of arbitrary dimension. The basic objects are tensor fields and not tensor components in a given vector frame or coordinate chart. In other words, various charts and frames can be introduced on the manifold and a given tensor field can have representations in each of them.

An important class of treated manifolds is that of pseudo-Riemannian manifolds, among which Riemannian manifolds and Lorentzian manifolds, with applications to General Relativity. In particular, SageManifolds implements the computation of the Riemann curvature tensor and associated objects (Ricci tensor, Weyl tensor). SageManifolds can also deal with generic affine connections, not necessarily Levi-Civita ones.

The SageManifolds project aims at extending the mathematics software system Sage towards differential geometry and tensor calculus. Like Sage, SageManifolds is free, open-source and is based on the Python programming language. We discuss here some details of the implementation, which relies on Sage’s parent/element framework, and present a concrete example of use.


Sailfish is an open source (LGPL) fluid dynamics solver based on the lattice Boltzmann method (LBM). It uses run-time code generation techniques to automatically generate optimized, simulation-specific code for GPU devices (both CUDA and OpenCL targets are supported).


The Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory is developing algorithms and software technology to enable the application of structured adaptive mesh refinement (SAMR) to large-scale multi-physics problems relevant to U.S. Department of Energy programs.

SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) is an object-oriented C++ software library that enables exploration of numerical, algorithmic, parallel computing, and software issues associated with applying structured adaptive mesh refinement (SAMR) technology in large-scale parallel application development. SAMRAI provides software tools for developing SAMR applications that involve coupled physics models and sophisticated numerical solution methods, and that require high-performance parallel computing hardware. SAMRAI enables integration of SAMR technology into existing codes and simplifies the exploration of SAMR methods in new application domains. Due to judicious application of object-oriented design, SAMRAI capabilities are readily enhanced and extended to meet specific problem requirements. The SAMRAI team collaborates with application researchers at LLNL and other institutions. These interactions motivate the continued evolution of the SAMRAI library.


An object-functional programming language for general software applications. Scala has full support for functional programming and a very strong static type system. This allows programs written in Scala to be very concise and thus smaller in size than other general-purpose programming languages. Many of Scala’s design decisions were inspired by criticism over the shortcomings of Java.

Scala source code is intended to be compiled to Java bytecode, so that the resulting executable code runs on a Java virtual machine. Java libraries may be used directly in Scala code and vice versa (Language interoperability).[8] Like Java, Scala is object-oriented, and uses a curly-brace syntax reminiscent of the C programming language. Unlike Java, Scala has many features of functional programming languages like Scheme, Standard ML and Haskell, including currying, type inference, immutability, lazy evaluation, and pattern matching. It also has an advanced type system supporting algebraic data types, covariance and contravariance, higher-order types, and anonymous types. Other features of Scala not present in Java include operator overloading, optional parameters, named parameters, raw strings, and no checked exceptions.


Saddle is a data manipulation library for Scala that provides array-backed, indexed, one- and two-dimensional data structures that are judiciously specialized on JVM primitives to avoid the overhead of boxing and unboxing.

Saddle offers vectorized numerical calculations, automatic alignment of data along indices, robustness to missing (N/A) values, and facilities for I/O.
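The idea of automatic index alignment with missing values can be illustrated with a minimal sketch. It is written in plain Python rather than Scala and does not use Saddle's API; the function name `align_add` and the sample data are hypothetical.

```python
import math

def align_add(left, right):
    """Add two index->value mappings, aligning on the union of their indices.

    Labels missing from either side produce NaN, mimicking N/A propagation.
    """
    result = {}
    for key in sorted(set(left) | set(right)):
        a = left.get(key, math.nan)
        b = right.get(key, math.nan)
        result[key] = a + b  # NaN propagates through the addition
    return result

prices = {"mon": 1.0, "tue": 2.0, "wed": 3.0}
adjust = {"tue": 0.5, "wed": 0.5, "thu": 0.5}
total = align_add(prices, adjust)
```

Only the labels present on both sides ("tue" and "wed") receive a numeric sum; the others are marked as missing instead of raising an error.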

Saddle draws inspiration from several sources, among them the R programming language & statistical environment, the numpy and pandas Python libraries, and the Scala collections library.


A Scala to JavaScript compiler. Scala.js compiles Scala code to JavaScript, allowing you to write your web application entirely in Scala.


The ScalaLab project aims to provide an efficient scientific programming environment for the Java Virtual Machine. The scripting language is based on the Scala programming language, enhanced with high-level scientific operators and with an integrated environment that provides a MATLAB-like working style. In addition, the large body of Java scientific libraries is easily accessible (often with a more convenient syntax). The main strengths of ScalaLab are numerical code speed and flexibility. The statically typed Scala language can give scripting code speeds similar to pure Java. A major design priority of ScalaLab is its user-friendly interface. We would like the user to enjoy writing scientific code, and we design the whole framework with this objective.


A suite of machine learning and numerical computing libraries. ScalaNLP is the umbrella project for several libraries, including Breeze and Epic. Breeze is a set of libraries for machine learning and numerical computing. Epic is a high-performance statistical parser and structured prediction library.


ScMathML is a Scala library for executing Content MathML. Content MathML is a move towards a standard, open format for representing mathematics with relatively well defined semantics. ScMathML takes formulas, and evaluates them in a Context, which provides access to domain objects, constants etc.
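Evaluating a Content MathML formula in a context might look roughly like the sketch below. It is a Python illustration, not ScMathML's API: the `evaluate` function and the `OPERATORS` table are hypothetical, only four binary operators are handled, and the MathML XML namespace is omitted for brevity.

```python
import operator
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical operator table covering a tiny subset of Content MathML.
OPERATORS = {
    "plus": operator.add,
    "times": operator.mul,
    "minus": operator.sub,
    "divide": operator.truediv,
}

def evaluate(node, context):
    """Recursively evaluate a Content MathML element against a context dict."""
    if node.tag == "cn":          # numeric literal
        return float(node.text)
    if node.tag == "ci":          # identifier looked up in the context
        return context[node.text.strip()]
    if node.tag == "apply":       # first child is the operator, rest are operands
        op, *args = list(node)
        values = [evaluate(arg, context) for arg in args]
        return reduce(OPERATORS[op.tag], values)
    raise ValueError(f"unsupported element: {node.tag}")

# Encodes x + 2 * y
formula = ET.fromstring(
    "<apply><plus/><ci>x</ci>"
    "<apply><times/><cn>2</cn><ci>y</ci></apply></apply>"
)
result = evaluate(formula, {"x": 1.0, "y": 3.0})
```

The context here is a plain dictionary; in a fuller design it could expose domain objects and constants in the same way.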


A Scala project that harvests sensor data from web sources. The data is then pushed to an SOS using the sos-injection module project. SosInjector is a project that wraps a Sensor Observation Service (SOS). The sos-injection module provides Java classes to enter stations, sensors, and observations into an SOS.

sensor-web-harvester is used to fill an SOS with observations from many well-known sensor sources (such as NOAA and NERRS). This project pulls sensor observation values from the source’s stations. It then formats the data to be placed into the user’s SOS by using the sos-injector. The source stations used are filtered by a chosen bounding box area.


Schur is a stand alone C program for interactively calculating properties of Lie groups and symmetric functions. Schur has been designed to answer questions of relevance to a wide range of problems of special interest to chemists, mathematicians and physicists - particularly for persons who need specific knowledge relating to some aspect of Lie groups or symmetric functions and yet do not wish to be encumbered with complex algorithms. The objective of Schur is to supply results with the complexity of the algorithms hidden from view so that the user can effectively use Schur as a scratch pad, obtaining a result and then using that result to derive new results in a fully interactive manner. Schur can be used as a tool for calculating branching rules, Kronecker products, Casimir invariants, dimensions, plethysms, S-function operations, Young diagrams and their hook lengths etc.

As well as being a research tool Schur forms an excellent tool for helping students to independently explore the properties of Lie groups and symmetric functions and to test their understanding by creating simple examples and moving on to more complex examples. The user has at his or her disposal over 160 commands which may be nested to give a vast variety of potential operations. Every command, with examples, is described in a 200 page manual. Attention has been given to input/output issues to simplify input and to give a well organized output. The output may be obtained in TeX form if desired. Log files may be created for subsequent editing. On line help files may be brought to screen at any time.


ScientiFig is a free tool to help you create, format or reformat scientific figures.


The SciRuby Project aims to provide Ruby with scientific capabilities similar to what the wonderful NumPy and SciPy libraries bring to Python. Our goal is to provide a complete suite of statistical, numerical, and visualization software tools for scientific computing.


SCIRun is a problem solving environment or "computational workbench" in which a user selects software modules that can be connected in a visual programming environment to create a high level workflow for experimentation. Each module exposes all the available parameters necessary for scientists to adjust the outcome of their simulation or visualization. The networks in SCIRun are flexible enough to enable duplication of networks and creation of new modules.

Many SCIRun users find this software particularly useful for their bioelectric field research. Their topics of investigation include cardiac electro-mechanical simulation, ECG and EEG forward and inverse calculations, modeling of deep brain stimulation, electromyography calculation, and determination of the electrical conductivity of anisotropic heart tissue. Users have also made use of SCIRun for the visualization of breast tumor brachytherapy, computer aided surgery, teaching, and a number of non-biomedical applications.

SciTools Github


A Python WMS service for geospatial gridded data (Only triangular unstructured meshes and logically rectangular grids officially supported at this time).


My First 5 Minutes On A Server; Or, Essential Security for Linux Servers


EncFS provides an encrypted filesystem in user-space. It runs with regular user permissions using the FUSE library.


Fail2ban scans log files (e.g. /var/log/apache/error_log) and bans IPs that show the malicious signs — too many password failures, seeking for exploits, etc. Generally Fail2Ban is then used to update firewall rules to reject the IP addresses for a specified amount of time, although any arbitrary other action (e.g. sending an email) could also be configured. Out of the box Fail2Ban comes with filters for various services (apache, courier, ssh, etc).
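The scan-and-ban idea can be sketched as follows. This is not Fail2ban's code or configuration: the log format, the `FAILURE` pattern and the `find_offenders` function are hypothetical, and the time window and firewall action are left out.

```python
import re
from collections import Counter

# Hypothetical log format; real filters ship per-service regular expressions.
FAILURE = re.compile(r"Failed password for \S+ from (\d+\.\d+\.\d+\.\d+)")

def find_offenders(log_lines, max_retry=3):
    """Return the set of IPs with at least `max_retry` failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILURE.search(line)
        if match:
            failures[match.group(1)] += 1
    return {ip for ip, count in failures.items() if count >= max_retry}

sample = [
    "sshd: Failed password for root from 203.0.113.7 port 22",
    "sshd: Failed password for admin from 203.0.113.7 port 22",
    "sshd: Accepted password for alice from 198.51.100.2 port 22",
    "sshd: Failed password for root from 203.0.113.7 port 22",
    "sshd: Failed password for bob from 198.51.100.9 port 22",
]
banned = find_offenders(sample)
```

A real deployment would pass each offending address to an action, such as a firewall rule that expires after the ban time.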

Fail2Ban is able to reduce the rate of incorrect authentication attempts; however, it cannot eliminate the risk that weak authentication presents. Configure services to use only two-factor or public/private key authentication mechanisms if you really want to protect services.


In high-end computing environments, remote file transfers of very large data sets to and from computational resources are commonplace as users are typically widely distributed across different organizations and must transfer in data to be processed and transfer out results for further analysis. Local transfers of this same data across file systems are also frequently performed by administrators to optimize resource utilization when new file systems come on-line or storage becomes imbalanced between existing file systems. In both cases, files must traverse many components on their journey from source to destination where there are numerous opportunities for performance optimization as well as failure. A number of tools exist for providing reliable and/or high performance file transfer capabilities, but most either do not support local transfers, require specific security models and/or transport applications, are difficult for individual users to deploy, and/or are not fully optimized for highest performance.

Shift is a framework for Self-Healing Independent File Transfer that provides high performance and resilience for local and remote transfers through a variety of techniques. These include end-to-end integrity via cryptographic hashes, throttling of transfers to prevent resource exhaustion, balancing transfers across resources based on load and availability, and parallelization of transfers across multiple source and destination hosts for increased redundancy and performance. In addition, Shift was specifically designed to accommodate the diverse heterogeneous environments of a widespread user base with minimal assumptions about operating environments. In particular, Shift is unique in its ability to provide advanced reliability and automatic single and multi-file parallelization to any stock command-line transfer application while being easily deployed by both individual users as well as entire organizations.
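End-to-end integrity via cryptographic hashes can be illustrated with a short sketch. It is not Shift's implementation: the helper names `sha256_of` and `verified_copy` are hypothetical, SHA-256 is an assumed choice of hash, and only a local single-file copy is shown.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verified_copy(source, destination):
    """Copy a file, then confirm source and destination hashes agree."""
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    if sha256_of(destination) != expected:
        raise IOError(f"checksum mismatch copying {source}")
    return expected

with tempfile.TemporaryDirectory() as workdir:
    src = Path(workdir) / "input.dat"
    dst = Path(workdir) / "output.dat"
    src.write_bytes(b"simulation results")
    checksum = verified_copy(src, dst)
    copied = dst.read_bytes()
```

A mismatch signals corruption somewhere between source and destination, at which point a self-healing tool could retransfer the affected file.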


The Scalable HeterOgeneous Computing (SHOC) benchmark suite is a collection of benchmark programs testing the performance and stability of systems using computing devices with non-traditional architectures for general purpose computing. Its initial focus is on systems containing Graphics Processing Units (GPUs) and multi-core processors, and on the OpenCL programming standard. It can be used on clusters as well as individual hosts.


The Super Instruction Architecture (SIA) is an environment comprising a programming language, SIAL, and a runtime system, SIP, with the goal of providing portable and efficient code related to tensor computations for a wide array of computing environments including distributed-memory environments [20]. SIAL exposes commonly used abstractions in scientific computing, such as blocking, providing the user a useful method of describing how an algorithm proceeds without unnecessarily complicating the code. Programs written in SIAL are compiled to a bytecode which is then interpreted by an SIP virtual machine which handles the execution of the program. Additionally, the SIP handles difficulties associated with parallelism, thus hiding this aspect of the program from the user. Distributed-memory parallelism in the SIP is handled through the use of asynchronous communication routines to aid in effectively overlapping computation with communication.
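The blocking abstraction mentioned above can be illustrated with a blocked matrix multiplication. This is a plain Python sketch, not SIAL code; the function `blocked_matmul` and the block size are hypothetical, and the blocks are processed sequentially rather than distributed.

```python
def blocked_matmul(a, b, block=2):
    """Multiply square matrices (lists of lists) one block at a time."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, n, block):
            for k0 in range(0, n, block):
                # Each (i0, j0, k0) triple is one block-level contraction,
                # the unit of work a runtime could schedule independently.
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, n)):
                        acc = 0.0
                        for k in range(k0, min(k0 + block, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] += acc
    return c

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
product = blocked_matmul(a, identity)
```

Expressing the algorithm in terms of blocks keeps the description short while leaving data movement and scheduling of each block to the runtime.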


This project is a SimTK toolset providing general multibody dynamics capability, that is, the ability to solve Newton’s 2nd law F=ma in any set of generalized coordinates subject to arbitrary constraints. Simbody is provided as an open source, object-oriented C++ API and delivers high-performance, accuracy-controlled science/engineering-quality results.

Simbody uses an advanced Featherstone-style formulation of rigid body mechanics to provide results in Order(n) time for any set of n generalized coordinates. This can be used for internal coordinate modeling of molecules, or for coarse-grained models based on larger chunks. It is also useful for large-scale mechanical models, such as neuromuscular models of human gait, robotics, avatars, and animation. Simbody can also be used in real time interactive applications for biosimulation as well as for virtual worlds and games.

This toolset was developed originally by Michael Sherman at the Simbios Center at Stanford, with major contributions from Peter Eastman and others. Simbody descends directly from the public domain NIH Internal Variable Dynamics Module (IVM) facility for molecular dynamics developed and kindly provided by Charles Schwieters. IVM is in turn based on the spatial operator algebra of Rodriguez and Jain from NASA’s Jet Propulsion Laboratory (JPL), and Simbody has adopted that formulation.

See also PyCraft


SIMD.js is a new API being developed by Intel, Google, and Mozilla for JavaScript which introduces several new types and functions for doing SIMD computations. For example, the Float32x4 type represents 4 float32 values packed up together. The API contains functions to operate on those values together, including all the basic arithmetic operations, and operations to rearrange, load, and store such values. The intent is for browsers to implement this API directly, and provide optimized implementations that make use of SIMD instructions in the underlying hardware.

The SIMD.js API itself is in active development. The ecmascript_simd GitHub repository currently serves as a provisional specification and also provides a polyfill implementation that offers the functionality, though of course not the accelerated performance, of the SIMD API on existing browsers. It also includes some benchmarks, which double as examples of basic SIMD.js usage.


Once you have generated a discrete problem you wish to translate this abstract formulation to specific code on a certain simulation platform. Simflowny is designed as an extensible framework on which plug-ins for different simulation platforms can be easily added. The current version provides support for the Cactus simulation framework and for the SAMRAI mesh management system. Both Cactus and SAMRAI provide parallelization by leveraging MPI-based communication between computers, which permits running simulations on clusters and taking advantage of multiple cores in modern chips.

Simflowny generates Fortran code for Cactus and C++ code for SAMRAI. It is also capable of compiling and linking a final binary that can be independently used as a simulation software. Alternatively, Simflowny also provides a GUI to manage simulations within the platform. Simulations may be launched locally, or remotely, by connecting to a Grid infrastructure.

Output both in Cactus and SAMRAI is mainly generated through HDF5 files, which contain snapshots from certain instants in the simulation. These results may be visualized with a number of commercial and free visualization tools.


SimGrid is a scientific instrument to study the behavior of large-scale distributed systems such as Grids, Clouds, HPC or P2P systems. It can be used to evaluate heuristics, prototype applications or even assess legacy MPI applications.


This project provides tools for postprocessing data on triangular grids (simplex cells), such as computing meridional and barotropic stream functions and several transports through user defined slices. The data are interpolated onto a regular grid of user defined mesh size, equidistant in each (horizontal) coordinate direction. Postprocessing takes place on this regular grid.
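Interpolating data from a triangular cell onto an equidistant grid could be sketched as below. This is an assumption about one reasonable approach (linear barycentric interpolation over a single triangle), not the project's actual algorithm; all function names are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of point p in triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b

def interpolate(p, triangle, values):
    """Linearly interpolate vertex values at p; None if p lies outside."""
    weights = barycentric(p, *triangle)
    if min(weights) < -1e-12:
        return None
    return sum(w * v for w, v in zip(weights, values))

def sample_on_grid(triangle, values, spacing):
    """Evaluate the interpolant on an equidistant grid over the bounding box."""
    xs = [v[0] for v in triangle]
    ys = [v[1] for v in triangle]
    nx = int((max(xs) - min(xs)) / spacing) + 1
    ny = int((max(ys) - min(ys)) / spacing) + 1
    return [
        [interpolate((min(xs) + i * spacing, min(ys) + j * spacing),
                     triangle, values) for i in range(nx)]
        for j in range(ny)
    ]

triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
values = [0.0, 2.0, 4.0]   # field f(x, y) = x + 2y sampled at the vertices
grid = sample_on_grid(triangle, values, spacing=1.0)
```

Grid points outside the triangle are returned as `None`; with a full mesh, each grid point would instead be looked up in the triangle that contains it.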


A web-based scientific application deployment and visualization framework for coastal modeling and beyond.


SINGE is a Python 3 code. It computes, for full spheres and spherical shells, inertial and inertia-gravito modes in the mantle frame of reference. Boussinesq, homogeneous and viscous fluids are taken into account, with various boundary conditions (no slip / stress-free for the velocity field, constant heat flux / isothermal for the temperature).

It uses a parallel pseudo-spectral approach in spherical geometry. The velocity field is projected onto poloidal and toroidal scalars, which are expanded on spherical harmonics in the angular directions and finite differences on an irregular mesh in the radial direction.


A computer algebra system for polynomial computations, with special emphasis on commutative and non-commutative algebra, algebraic geometry, and singularity theory.

Its advanced algorithms, contained in currently more than 90 libraries, address topics such as absolute factorization, algebraic D-modules, classification of singularities, deformation theory, Gauss-Manin systems, Hamburger-Noether (Puiseux) development, invariant theory, (non-) commutative homological algebra, normalization, primary decomposition, resolution of singularities, and sheaf cohomology.

Further functionality is obtained by combining Singular with third-party software linked to SINGULAR. This includes tools for convex geometry, tropical geometry, and visualization.


Skeleton programming is an approach where an application is written with the help of "skeletons". A skeleton is a pre-defined, generic component such as map, reduce, scan, farm, pipeline etc. that implements a common specific pattern of computation and data dependence, and that can be customized with (sequential) user-defined code parameters. Skeletons provide a high degree of abstraction and portability with a quasi-sequential programming interface, as their implementations encapsulate all low-level and platform-specific details such as parallelization, synchronization, communication, memory management, accelerator usage and other optimizations.
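The skeleton idea can be illustrated with a small sequential sketch in Python. It does not reflect the API of any particular skeleton framework; the names `map_skeleton`, `reduce_skeleton` and `scan_skeleton` are hypothetical, and a real implementation would hide a parallel backend behind the same interface.

```python
from functools import reduce as _reduce
from itertools import accumulate

def map_skeleton(user_fn):
    """Build an element-wise skeleton from a sequential user function."""
    return lambda *containers: [user_fn(*args) for args in zip(*containers)]

def reduce_skeleton(user_fn):
    """Build a reduction skeleton; user_fn should be associative."""
    return lambda container: _reduce(user_fn, container)

def scan_skeleton(user_fn):
    """Build an inclusive prefix-scan skeleton."""
    return lambda container: list(accumulate(container, user_fn))

# Customise the skeletons with ordinary sequential functions.
multiply = map_skeleton(lambda x, y: x * y)
total = reduce_skeleton(lambda x, y: x + y)
running = scan_skeleton(lambda x, y: x + y)

dot = total(multiply([1, 2, 3], [4, 5, 6]))
prefix = running([1, 2, 3, 4])
```

The caller only supplies the per-element functions; how the pattern is executed is entirely the skeleton's concern, which is what makes the interface quasi-sequential.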

SkePU is an open-source skeleton programming framework for multicore CPUs and multi-GPU systems. It is a C++ template library with six data-parallel and one task-parallel skeletons, two generic container types, and support for execution on multi-GPU systems both with CUDA and OpenCL.


SkyNet is an efficient and robust neural network training code for machine learning. It is able to train large and deep feed-forward neural networks, including autoencoders, for use in a wide range of supervised and unsupervised learning applications, such as regression, classification, density estimation, clustering and dimensionality reduction. SkyNet is implemented in C/C++ and fully parallelised using MPI.

BAMBI (Blind Accelerated Multimodal Bayesian Inference) is a Bayesian inference engine that combines the benefits of SkyNet with MultiNest. It operates by simultaneously performing Bayesian inference using MultiNest and learning the likelihood function using SkyNet. Once SkyNet has learnt the likelihood to sufficient accuracy, inference finishes almost instantaneously.


SLEPc is a software library for the solution of large scale sparse eigenvalue problems on parallel computers. It is an extension of PETSc and can be used for linear eigenvalue problems in either standard or generalized form, with real or complex arithmetic. It can also be used for computing a partial SVD of a large, sparse, rectangular matrix, and to solve nonlinear eigenvalue problems (polynomial or general). Additionally, SLEPc provides solvers for the computation of the action of a matrix function on a vector. SLEPc is based on the PETSc data structures and it employs the MPI standard for message-passing communication.
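As a toy illustration of a standard eigenvalue problem on a sparse matrix, the sketch below estimates the dominant eigenvalue with plain power iteration. This is a textbook method chosen for brevity, not one of SLEPc's solvers, and it does not use the SLEPc or PETSc APIs; the storage format and function names are hypothetical.

```python
import math

def power_iteration(matvec, size, iterations=100):
    """Estimate the dominant eigenpair using only matrix-vector products."""
    vector = [1.0] + [0.0] * (size - 1)
    eigenvalue = 0.0
    for _ in range(iterations):
        image = matvec(vector)
        norm = math.sqrt(sum(x * x for x in image))
        vector = [x / norm for x in image]
        # Rayleigh quotient v^T A v for the normalised vector
        eigenvalue = sum(v * w for v, w in zip(vector, matvec(vector)))
    return eigenvalue, vector

# Sparse symmetric matrix stored as {row: {col: value}}; only nonzeros are kept.
sparse = {
    0: {0: 2.0, 1: 1.0},
    1: {0: 1.0, 1: 2.0, 2: 1.0},
    2: {1: 1.0, 2: 2.0},
}

def matvec(vector):
    """Multiply the sparse matrix by a dense vector."""
    return [sum(value * vector[col] for col, value in sparse[row].items())
            for row in range(len(vector))]

eigenvalue, eigenvector = power_iteration(matvec, size=3)
```

The solver touches the matrix only through `matvec`, which is why such methods suit large sparse problems where the matrix is never formed densely.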


slepc4py are Python bindings for SLEPc, the Scalable Library for Eigenvalue Problem Computations.


Slicer, or 3D Slicer, is a free, open source software package for visualization and image analysis.


The Surface Water Modeling System (SMS) is a comprehensive graphical environment for one-, two-, and three-dimensional hydrodynamic modeling. A pre- and post-processor for surface water modeling and design, SMS includes 2D finite element, 2D finite difference, 3D visualization modeling tools, and limited 1D support. Supported models include the USACE-ERDC supported TABS-MD (GFGEN, RMA2, RMA4, SED2D-WES), ADCIRC, ADH, CGWAVE, CMS-Flow, CMS-Wave, STWAVE, and PTM models. Comprehensive interfaces have also been developed for facilitating the use of the FHWA commissioned analysis FESWMS package. SMS also includes a generic model interface, which can be used to support models which have not been officially incorporated into the system.

The numeric models supported in SMS compute a variety of information applicable to surface water modeling. Primary applications of the models include calculation of water surface elevations and flow velocities for shallow water flow problems, for both steady-state or dynamic conditions. Additional applications include the modeling of contaminant migration, salinity intrusion, sediment transport (scour and deposition), wave energy dispersion, wave properties (directions, magnitudes and amplitudes) and others.

The SMS interface is composed of various modules which streamline the modeling process: Scatter Data, Map conceptualization, GIS, particle tracking, annotation, and the new raster module.


SocketCluster is an open source, multi-process realtime environment written in JavaScript (Node.js). You can build entire applications on top of it or you can use it alongside existing systems written in other languages.

SC supports both direct client-server communication and group communication via pub/sub channels.

SC is designed to scale both vertically across multiple CPU cores and horizontally across multiple machines/instances (via pub/sub channel synchronization).


SOFA is an Open Source framework primarily targeted at real-time simulation, with an emphasis on medical simulation. It is mostly intended for the research community to help develop newer algorithms, but can also be used as an efficient prototyping tool.

The SOFA architecture relies on several innovative concepts, in particular the notion of multi-model representation. In SOFA, most simulation components (deformable models, collision models, instruments, …) can have several representations, connected together through a mechanism called mapping. Each representation can then be optimized for a particular task (e.g. collision detection, visualization) while at the same time improving interoperability by creating a clear separation between the functional aspects of the simulation components. As a consequence, it is possible to have models of very different nature interact together, for instance rigid bodies, deformable objects, and fluids. At a finer level of granularity, we also propose a decomposition of physical models (i.e. any model that behaves according to the laws of physics) into a set of basic components. This decomposition leads for instance to a representation of mechanical models as a set of degrees of freedom and force fields acting on these degrees of freedom. Another key aspect of SOFA is the use of a scene-graph to organize and process the elements of a simulation while clearly separating the computation tasks from their possibly parallel scheduling.

Software Collections

Software Collections give you power to build, install, and use multiple versions of software on the same system, without affecting system-wide installed packages.


Somoclu is a massively parallel tool for training self-organizing maps on large data sets written in C++. It builds on OpenMP for multicore execution, and on MPI for distributing the workload across the nodes in a cluster. It is also able to boost training by using CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R and MATLAB interfaces facilitate interactive use. Apart from fast execution, memory use is highly optimized, enabling training large emergent maps even on a single node.
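For readers unfamiliar with self-organizing maps, a minimal sequential training loop is sketched below. It does not use Somoclu's interfaces; the one-dimensional map, the linear decay schedule and all function names are assumptions made for illustration.

```python
import math

def best_matching_unit(weights, sample):
    """Index of the node whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], sample))

def train_step(weights, sample, rate, radius):
    """Pull the winner and its neighbours on a 1-D map toward the sample."""
    winner = best_matching_unit(weights, sample)
    for i, node in enumerate(weights):
        influence = math.exp(-((i - winner) ** 2) / (2.0 * radius ** 2))
        weights[i] = [w + rate * influence * (x - w) for w, x in zip(node, sample)]
    return winner

def train(weights, data, epochs=50, rate=0.5, radius=1.0):
    """Online training with linearly decaying rate and neighbourhood radius."""
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        for sample in data:
            train_step(weights, sample, rate * decay, max(radius * decay, 0.1))
    return weights

nodes = [[i / 3.0, i / 3.0] for i in range(4)]      # 1-D map of four nodes
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
train(nodes, data)
low = best_matching_unit(nodes, [0.0, 0.0])
high = best_matching_unit(nodes, [1.0, 1.0])
```

After training, the two clusters map to different nodes. The search for the best matching unit dominates the cost, which is the part a parallel tool distributes across cores, nodes or GPUs.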


A flexible package manager designed to support multiple versions, configurations, platforms, and compilers.

Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments. It was designed for large supercomputing centers, where many users and application teams share common installations of software on clusters with exotic architectures, using libraries that do not have a standard ABI. Spack is non-destructive: installing a new version does not break existing installations, so many configurations can coexist on the same system.


Apache Spark is an open-source cluster computing framework originally developed in the AMPLab at UC Berkeley. In contrast to Hadoop’s two-stage disk-based MapReduce paradigm, Spark’s in-memory primitives provide performance up to 100 times faster for certain applications.[1] By allowing user programs to load data into a cluster’s memory and query it repeatedly, Spark is well suited to machine learning algorithms.[2]

Spark requires a cluster manager and a distributed storage system. For cluster management, Spark supports standalone (native Spark cluster), Hadoop YARN, or Apache Mesos.[3] For distributed storage, Spark can interface with a wide variety of systems, including Hadoop Distributed File System (HDFS),[4] Cassandra,[5] OpenStack Swift, and Amazon S3. Spark also supports a pseudo-distributed mode, usually used only for development or testing purposes, where distributed storage is not required and the local file system can be used instead; in this scenario, Spark runs on a single machine with one worker per CPU core.


SPASM is a templatised C++ library for the storage and manipulation of a variety of probabilistic representations. Representations currently considered are: Gaussian, Gaussian Mixtures, Parzen density Estimates, Particles and Discrete grids. This library will include:

  • Storage classes for PDF’s and likelihoods

  • Basic operators such as multiplication, division, convolution and addition.

  • Conversion between common types (e.g. Gaussian)

  • Information measures (such as entropy etc.)

  • Distance measures

  • Complexity Reduction (Gaussian Mixtures have a habit of growing after common operations)
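A few of the operations listed above can be sketched for the simplest representation, a one-dimensional Gaussian. This is a Python illustration under stated assumptions, not SPASM's C++ interface: the `Gaussian` class and its method names are hypothetical, multiplication is the normalised product of two densities, and `convolve` gives the density of the sum of two independent variables.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Gaussian:
    mean: float
    variance: float

    def __mul__(self, other):
        """Normalised product of two 1-D Gaussian densities."""
        precision = 1.0 / self.variance + 1.0 / other.variance
        mean = (self.mean / self.variance + other.mean / other.variance) / precision
        return Gaussian(mean, 1.0 / precision)

    def convolve(self, other):
        """Density of the sum of two independent Gaussian variables."""
        return Gaussian(self.mean + other.mean, self.variance + other.variance)

    def entropy(self):
        """Differential entropy in nats, an example information measure."""
        return 0.5 * math.log(2.0 * math.pi * math.e * self.variance)

prior = Gaussian(0.0, 4.0)
measurement = Gaussian(2.0, 4.0)
posterior = prior * measurement          # fuse two estimates
predicted = prior.convolve(measurement)  # add independent uncertainty
```

Fusing two equally uncertain estimates halves the variance, while convolution adds the variances; the same operations on Gaussian mixtures multiply the number of components, which is why complexity reduction appears in the list.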

Other libraries for Bayesian filtering exist, such as Bayes++ and BFL. These libraries focus on filtering in general rather than on the methods used for storage and manipulation of a variety of representations.

This library has been motivated by robotics applications, such as feature tracking, SLAM and other forms of localisation. Numerous other applications exist that require probability representations, such as statistical learning.

spatial indexing


A C++ implementation of R*-tree, an MVR-tree and a TPR-tree with C API. The library was created to provide an extensible framework that will support robust spatial indexing methods, support for sophisticated spatial queries, enable easy to use interfaces for inserting, deleting and updating information, enable a wide variety of customization capabilities, and provide index persistence.


A ctypes Python wrapper of libspatialindex that provides a number of advanced spatial indexing features for the spatially curious Python user. The features include nearest neighbor search, intersection search, multi-dimensional indexes, clustered indexes, bulk loading, deletion, disk serialization, and custom storage implementation.


A fast, accurate Python geohashing library. A geohash is a string representation of two-dimensional geometric coordinates; the technique is well described in the Wikipedia Geohash article and is essentially a form of Z-order curve. A quadtree can be used to construct the string representation: in this library, 0 stands for SW, 1 for SE, 2 for NW and 3 for NE. The project aims to maintain a fast, accurate Python library for geohash, quadhash and grid codes.
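The quadtree digit convention (0 for SW, 1 for SE, 2 for NW, 3 for NE) can be illustrated with a minimal encoder. This is a sketch, not the library's API: the function name `quadkey`, the `precision` parameter and the treatment of points lying exactly on a midline (assigned to the north or east cell) are assumptions.

```python
def quadkey(lat, lon, precision=8):
    """Encode a coordinate as a quadtree string: 0=SW, 1=SE, 2=NW, 3=NE."""
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    digits = []
    for _ in range(precision):
        mid_lat = (south + north) / 2.0
        mid_lon = (west + east) / 2.0
        digit = 0
        if lon >= mid_lon:      # eastern half
            digit += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat >= mid_lat:      # northern half
            digit += 2
            south = mid_lat
        else:
            north = mid_lat
        digits.append(str(digit))
    return "".join(digits)

coarse = quadkey(-10.0, 10.0, precision=2)
fine = quadkey(-10.0, 10.0, precision=6)
```

Each extra digit halves the cell in both directions, so a longer code always starts with the shorter code for the same point; nearby points therefore tend to share prefixes, as expected of a Z-order curve.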

Spherical Harmonics Manipulator

This software computes synthesis of spherical harmonics models on sparse coordinates or grids (provided in a geodetic or geocentric reference system). It exploits basic parallelism using OpenMP directives. The binary requires the MCR library.


Tool for estimation of statistical characteristics of multivariate random functions.

This paper examines the feasibility of high-level Python based utilities for numerically intensive applications via an example of a multidimensional integration for the evaluation of the statistical characteristics of a random variable. We discuss the approaches to the implementation of mathematically formulated incremental expressions using high-level scripting code and low-level compiled code. Due to the dynamic typing of the Python language, components of the algorithm can be easily coded in a generic way as algorithmic templates. Using the Enthought Development Suite they can be effectively assembled into a flexible computational framework that can be configured to execute the code for arbitrary combinations of integration schemes and versions of instantiated code. The paper describes the development cycle using a simple running example involving averaging of a random two-parametric function that includes discontinuity. This example is also used to compare the performance of the available algorithmic and executional features.


SQuadGen is a mesh generation utility that uses a cubed-sphere base mesh to generate quadrilateral meshes with user-specified enhancements. In order to determine where enhancement is desired, the user provides a PNG file which corresponds to a latitude-longitude grid. Raster values with higher brightness (whiter values) are tagged for refinement. The algorithm uses a basic paving technique and supports two paving stencil types: Low-connectivity (LOWCONN) and CUBIT-type transition regions.


A Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. It provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language.

SQLAlchemy considers the database to be a relational algebra engine, not just a collection of tables. Rows can be selected from not only tables but also joins and other select statements; any of these units can be composed into a larger structure. SQLAlchemy’s expression language builds on this concept from its core.

SQLAlchemy is most famous for its object-relational mapper (ORM), an optional component that provides the data mapper pattern, where classes can be mapped to the database in open ended, multiple ways - allowing the object model and database schema to develop in a cleanly decoupled way from the beginning.


GeoAlchemy is an extension of SQLAlchemy. It provides support for Geospatial data types at the ORM layer using SQLAlchemy. It aims to support spatial operations and relations specified by the Open Geospatial Consortium (OGC). GeoAlchemy supports PostGIS, SpatiaLite and MySQL.


SqueezePlug & Max2Play is the new combination of SqueezePlug, the famous Multiroom Audio Solution, and Max2Play, the web-based framework for controlling Linux-based Mini-Computers such as the Raspberry Pi, Odroid and others through a simple web interface without any Linux know-how at all. SqueezePlug is now a plug-in in Max2Play to make the configuration as easy as possible. There is no need to connect a monitor, a keyboard or a mouse to the device itself. It all runs headless, and no special tools like Putty are needed any more. The configuration is as simple as configuring a router from a web interface.

SqueezePlug & Max2Play is a Multiroom Audio Solution with server and player components. Multiroom Audio Solution means that you have one server and as many players as you like. Every player uses the shared music from the server. Players can be synced to play the same music, or they can play different music. A Mini-Computer can be a server, a player, or both at once. The files can reside on the Mini-Computer itself, e.g. on a directly attached USB-HD, or at another location on your network. A Mini-Computer doesn’t make any noise, because it has no cooling components, and it consumes very little power.



A Python 2.5-2.7 library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks. It provides a basic suite of operations for executing local or remote shell commands (normally or via sudo) and uploading/downloading files, as well as auxiliary functionality such as prompting the running user for input, or aborting execution. Typical use involves creating a Python module containing one or more functions, then executing them via the fab command-line tool.


A Python (2.6+, 3.3+) implementation of the SSHv2 protocol [1], providing both client and server functionality. While it leverages a Python C extension for low level cryptography (PyCrypto), Paramiko itself is a pure Python interface around SSH networking concepts.


A probabilistic programming language for Bayesian inference written in C++.[1] The Stan language is used to specify a Bayesian statistical model, which is an imperative declaration of the log probability density function. It has interfaces for R and Python as well as a command-line interface.


STELLA is a strongly typed, object-oriented, Lisp-like language, designed to facilitate symbolic programming tasks in artificial intelligence applications. STELLA preserves those features of Common Lisp deemed essential for symbolic programming such as built-in support for dynamic data structures, heterogeneous collections, first-class symbols, powerful iteration constructs, name spaces, an object-oriented type system with a meta-object protocol, exception handling, and language extensibility through macros, but without compromising execution speed, interoperability with non-STELLA programs, and platform independence. STELLA programs are translated into a target language such as C++, Common Lisp, or Java, and then compiled with the native target language compiler to generate executable code. The language constructs of STELLA are restricted to those that can be translated directly into native constructs of the intended target languages, thus enabling the generation of highly efficient as well as readable code.


Stetl (Streaming ETL) is a lightweight, geospatial ETL framework for the conversion of rich (e.g. GML) geospatial data. Stetl uses existing transformation tools like GDAL/OGR and XSLT and is glued through Python. A config file specifies an ETL chain of modules. Stetl is speed-optimized by using native calls like ogr2ogr, libxml and libxslt (via lxml).


StocPy is an expressive probabilistic programming language, provided as a Python library.

We introduce the first, general purpose, slice sampling inference engine for probabilistic programs. This engine is released as part of StocPy, a new Turing-Complete probabilistic programming language, available as a Python library. We present a transdimensional generalisation of slice sampling which is necessary for the inference engine to work on traces with different numbers of random variables. We show that StocPy compares favourably to other PPLs in terms of flexibility and usability, and that slice sampling can outperform previously introduced inference methods. Our experiments include a logistic regression, HMM, and Bayesian Neural Net.
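For readers unfamiliar with slice sampling, here is a minimal sketch of the basic one-dimensional algorithm (stepping out, then shrinkage). It is not StocPy code and does not implement the transdimensional generalisation described above; the function name, the `width` parameter and the standard-normal target are assumptions chosen for illustration.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, rng=None):
    """Basic univariate slice sampler working on the log scale."""
    rng = rng or random.Random(0)
    x = x0
    samples = []
    for _ in range(n_samples):
        # 1. Draw an auxiliary height uniformly under the density at x.
        log_y = log_density(x) + math.log(1.0 - rng.random())
        # 2. Step out: grow an interval around x until both ends leave the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # 3. Shrink: sample in the interval, narrowing it on each rejection.
        while True:
            candidate = rng.uniform(left, right)
            if log_density(candidate) > log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        samples.append(x)
    return samples


# Target: unnormalised standard normal, log p(x) = -x^2 / 2.
samples = slice_sample(lambda x: -0.5 * x * x, x0=0.0, n_samples=5000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

Only the unnormalised log density is needed, which is part of what makes the method attractive as a general-purpose inference engine.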


Streamline Version 4 is a versatile Fortran 77 & C++ program for calculating charged test particle trajectories or field-lines for user-specified fields using the test-particle method. The user has the freedom to specify any type of field (analytical, tabulated in files, time dependent, etc.) and maintains complete control over initial conditions of trajectories/field-lines and boundary conditions of specified fields. The structure of Streamline was redesigned from previous versions in order to know not only particle or field-lines positions and velocities at each step of the simulations, but also the instantaneous field values as seen by particles. This was made to compute the instantaneous value of the particle’s magnetic moment, but other applications are possible too. Accuracy tests of the code are shown for different cases, i.e., particles moving in constant magnetic field, magnetic plus constant electric field and wave field. In addition in the last part of the paper we concentrate our discussion on the study of velocity space diffusion of charged particles in turbulent slab fields, paying attention to the discretization of the fields and the temporal discretization of the dynamical equations. The diffusion of charged particles is a very common topic in plasma physics and astrophysics since it plays an important role in many different phenomena such as stochastic particle acceleration, diffusive shock acceleration, solar energetic particle propagation, and the scattering required for the solar modulation of galactic cosmic rays.

Structure Synth

Structure Synth is a cross-platform application for generating 3D structures by specifying a design grammar. Even simple systems may generate surprising and complex structures. Structure Synth is built in C++, OpenGL, and Qt 4.

See also Context Free Art.


A library of Python routines and C extensions that has been developed to provide a general astronomical data analysis infrastructure.


A package for aligning and combining Hubble Space Telescope images. Starting in July 2012, all drizzled data products obtained from MAST were produced with AstroDrizzle. An abbreviation for Astrometric Drizzle, AstroDrizzle was designed from the ground-up to substantially improve the handling of distortion in the image header World Coordinate System.

AstroDrizzle removes geometric distortion, corrects for sky background variations, flags cosmic-rays, and combines images with optional subsampling. Drizzled data products from MAST are generated for single visit associations only.

To combine data from additional visits, TweakReg may be used to update the image WCS using matched source lists. Once the full set of images of a given target are properly aligned, they may be combined using AstroDrizzle.


PyFITS provides an interface to FITS formatted files in the Python scripting language and PyRAF, the Python-based interface to IRAF. It is useful both for interactive data analysis and for writing analysis scripts in Python using FITS files as either input or output. PyFITS is a development project of the Science Software Branch at the Space Telescope Science Institute.

All of the functionality of PyFITS is now available in Astropy as the astropy.io.fits package, which is now publicly available. Although we will continue to release PyFITS separately in the short term, including any critical bug fixes, we will eventually stop releasing new versions of PyFITS as a stand-alone product.


PyRAF is a command language for running IRAF tasks that is based on the Python scripting language. It gives users the ability to run IRAF tasks in an environment that has all the power and flexibility of Python. PyRAF can be installed along with an existing IRAF installation; users can then choose to run either PyRAF or the IRAF CL.


A synthetic photometry package.


SuperCollider is an environment and programming language for real time audio synthesis and algorithmic composition. It provides an interpreted object-oriented language which functions as a network client to a state of the art, realtime sound synthesis server.

SuperCollider is an environment and programming language originally released in 1996 by James McCartney for real-time audio synthesis and algorithmic composition. Since then it has been evolving into a system used and further developed by both scientists and artists working with sound. It is an efficient and expressive dynamic programming language providing a framework for acoustic research, algorithmic music, and interactive programming.


An interactive mathematics visualization program. With SURFER you can experience the relation between formulas and forms, i.e. mathematics and art, in an interactive way. You can enter simple equations that produce beautiful images, which are surfaces in space.

Mathematically, the program visualizes real algebraic geometry in real-time. The surfaces shown are given by the zero set of a polynomial equation in the variables x, y and z. All points in space that solve the equation are displayed and form the surface.
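To make "the zero set of a polynomial" concrete, the sketch below tests whether points lie on a surface and finds where a viewing ray first meets it. It is an illustration only, not SURFER's rendering method; the unit sphere, the function names and the step counts are assumptions.

```python
def f(x, y, z):
    """Polynomial whose zero set is the unit sphere: x^2 + y^2 + z^2 - 1."""
    return x * x + y * y + z * z - 1.0


def on_surface(point, tol=1e-9):
    """A point belongs to the surface when it solves f(x, y, z) = 0."""
    return abs(f(*point)) <= tol


def first_hit(origin, direction, t_max=4.0, steps=400):
    """First parameter t in [0, t_max] at which the ray meets the surface."""
    def g(t):
        return f(*(o + t * d for o, d in zip(origin, direction)))

    dt = t_max / steps
    t0, g0 = 0.0, g(0.0)
    for i in range(1, steps + 1):
        t1 = i * dt
        g1 = g(t1)
        if g0 * g1 <= 0.0:  # sign change (or exact zero) inside [t0, t1]
            lo, hi = t0, t1
            for _ in range(60):  # refine the bracket by bisection
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        t0, g0 = t1, g1
    return None


t_hit = first_hit(origin=(0.0, 0.0, -3.0), direction=(0.0, 0.0, 1.0))
print(on_surface((0.0, 0.0, 1.0)), t_hit)
```

Swapping in a different polynomial for `f` changes the surface without touching the rest of the code.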


SYMPLER is an object-oriented particle dynamics code. With this freely available software, simulations can be performed ranging from microscopic classical molecular dynamics up to the Lagrangian particle-based discretisation of macroscopic continuum mechanics equations. We show how the runtime definition of arbitrary degrees of freedom and of arbitrary equations of motion allows for modular and symbolic computation with high flexibility. Arbitrary symbolic expressions for inter-particle forces can be defined as well as fluxes of arbitrarily many additional scalar, vectorial or tensorial degrees of freedom. The integration in a high performance grid computing environment makes huge geographically distributed computational resources accessible to the software by an easy-to-use interface.


SystemTap (stap) is a scripting language and tool for dynamically instrumenting running production Linux kernel-based operating systems. System administrators can use SystemTap to extract, filter and summarize data in order to enable diagnosis of complex performance or functional problems.

SystemTap scripts are written in the SystemTap language (described in the language reference[7]), saved as .stp files, and run with the stap command-line tool[8]. The system carries out a number of passes on the script before allowing it to run, at which point the script is compiled into a loadable kernel module and loaded into the kernel. Listing modules shows each SystemTap script as stap_<UUID>. The module is unloaded when the tap has finished running. Scripts generally focus on events (such as starting or finishing a script), compiled-in probe points such as Linux "tracepoints", or the execution of functions or statements in the kernel or user-space.


TACO is an object oriented control system originally developed at the European Synchrotron Radiation Facility (ESRF) to control accelerators and beamlines and data acquisition systems. TACO is very scalable and can be used for simple single device laboratory like setups with only a few devices or for a big installation comprising thousands of devices. TACO is a cheap and simple solution for doing distributed home automation. TACO is available free of charge without warranties.

TACO is object oriented because it treats ALL (physical and logical) control points in a control system as objects in a distributed environment. All actions are implemented in classes. New classes can be constructed out of existing classes in a hierarchical manner, thereby ensuring a high level of software reuse. Classes can be written in C++, in C (using a methodology called Objects in C), in Python or in LabView (using G).

This has been largely superseded by TANGO.


Tahoe-LAFS is a Free and Open decentralized cloud storage system. It distributes your data across multiple servers. Even if some of the servers fail or are taken over by an attacker, the entire file store continues to function correctly, preserving your privacy and security.


Taiga greatly simplifies the use of science data. It is a self-sufficient bundle of free/open source software that webifies major scientific data formats, such as NetCDF, HDF4 and HDF5. Through webification (w10n), meta attributes and data arrays inside a file can be directly retrieved, transformed, or manipulated using clear and meaningful URLs.


TakTuk is a tool for deploying parallel remote executions of commands to a potentially large set of remote nodes. It spreads itself using an adaptive algorithm and sets up an interconnection network to transport commands and perform I/O multiplexing/demultiplexing. The TakTuk mechanics dynamically adapt to the environment (machine performance and current load, network contention) by using a reactive work-stealing algorithm that mixes local parallelization and work distribution.

TakTuk is a tool especially suited to the administration of parallel machines because it eases the handling of groups of hosts. It might be used in batch mode for simple machine state tests (e.g. testing host responsiveness simply by letting the engine set the network up using its default connector, ssh) or in interactive mode for deep investigation on several hosts, using the TakTuk command interpreter to execute multiple commands on multiple hosts (standard test sequence on a group of hosts, ping pong test between several machines, …).


The Tensor Contraction Engine (TCE) is a compiler for a domain-specific language that allows chemists to specify complicated scientific computations in quantum chemistry and physics. The TCE searches for an optimal implementation and generates FORTRAN code. First, algebraic transformations are used to reduce the number of operations. We then minimize the storage requirements to fit the computation within the disk limits by fusing loops. We have designed an algorithm that finds the optimal evaluation order if intermediate arrays are allocated dynamically and are working on combining loop fusion with dynamic memory allocation. If the computation does not fit within the disk limits, recomputation must be traded off for a reduction in storage requirements. If the target machine is a multi-processor machine, we optimize the communication cost together with finding a fusion configuration for minimizing storage. Finally, we minimize the data access times by minimizing disk-to-memory and memory-to-cache traffic and generate FORTRAN code. We have completed a first prototype of the TCE and are working on implementing the communication minimization and data access optimization algorithms. In future research, we will extend this approach to handle common subexpressions, symmetric matrices, and sparse matrices.

The Tensor Contraction Engine (TCE) is the application of compiler optimization and source-to-source translation technology to craft a domain specific language for many-body theories in chemistry and physics. The underlying equations of these theories are all expressed as contractions of many-dimensional arrays or tensors. There may be many thousands of such terms in any one problem but their regularity means that they can be translated into efficient massively parallel code that respects the boundedness of each level of the memory hierarchy and minimizes overall runtime with effective trade-off of increased computation for reduced memory consumption.
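The effect of an algebraic transformation on operation count can be seen in a toy contraction. The sketch below is not TCE output; it is a hand-written illustration, with invented function names, of evaluating S[i][l] = sum over j, k of A[i][j] * B[j][k] * C[k][l] either directly or through an intermediate array T[i][k] = sum over j of A[i][j] * B[j][k].

```python
def contract_direct(A, B, C):
    """Evaluate the triple product in one four-deep loop nest."""
    n = len(A)
    S = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for l in range(n):
            for j in range(n):
                for k in range(n):
                    S[i][l] += A[i][j] * B[j][k] * C[k][l]
                    mults += 2
    return S, mults


def contract_factored(A, B, C):
    """Same result via an intermediate T = A*B, then S = T*C."""
    n = len(A)
    mults = 0
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for j in range(n):
                T[i][k] += A[i][j] * B[j][k]
                mults += 1
    S = [[0] * n for _ in range(n)]
    for i in range(n):
        for l in range(n):
            for k in range(n):
                S[i][l] += T[i][k] * C[k][l]
                mults += 1
    return S, mults


n = 4
A = [[i + j for j in range(n)] for i in range(n)]
B = [[i - j for j in range(n)] for i in range(n)]
C = [[i * j + 1 for j in range(n)] for i in range(n)]

S_direct, cost_direct = contract_direct(A, B, C)
S_factored, cost_factored = contract_factored(A, B, C)
print(cost_direct, cost_factored)
```

The factored form trades an extra n-by-n intermediate array for a lower multiplication count (2n^3 instead of 2n^4), which is the kind of computation-versus-storage trade-off the description refers to.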


TenEig is a MATLAB toolbox to find eigenpairs of a tensor. TenEig is written in MATLAB based on PSOLVE and provides routines for solving different kinds of eigenvalue problems of a tensor, like E-eigenvalues, eigenvalues, Z-eigenvalues (real E-eigenvalues), H-eigenvalues (real eigenvalues), or more general mode-k eigenvalues. The corresponding eigenpairs are also provided.


A Matlab toolbox for tensor computations.


The Python Tensor Toolbox provides functionalities for the decomposition of tensors in tensor-train format [1] and spectral tensor-train format [2].

Tensor Toolbox

The Tensor Toolbox provides the following classes for manipulating dense, sparse, and structured tensors using MATLAB’s object-oriented features:

  • tensor - A (dense) multidimensional array (extends MATLAB’s current capabilities).

  • sptensor - A sparse multidimensional array.

  • tenmat - Store a tensor as a matrix, with extra information so that it can be converted back into a tensor.

  • sptenmat - Store an sptensor as a sparse matrix in coordinate format, with extra information so that it can be converted back into an sptensor.

  • ttensor - Store a tensor decomposed as a Tucker operator.

  • ktensor - Store a tensor decomposed as a Kruskal operator.
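The idea behind tenmat, storing a tensor as a matrix together with enough information to convert it back, can be sketched in plain Python. This is not the Tensor Toolbox implementation or its MATLAB API: the function names are hypothetical, and the column ordering (the first remaining mode varies fastest) is one common convention assumed here for illustration.

```python
from itertools import product


def column_strides(shape, mode):
    """Stride of every mode except `mode` in the flattened column index."""
    strides, size = {}, 1
    for k in range(len(shape)):
        if k != mode:
            strides[k] = size
            size *= shape[k]
    return strides, size


def unfold(tensor, shape, mode):
    """Matricize: rows follow `mode`, columns enumerate all other modes."""
    strides, n_cols = column_strides(shape, mode)
    matrix = [[0] * n_cols for _ in range(shape[mode])]
    for idx in product(*(range(n) for n in shape)):
        col = sum(idx[k] * s for k, s in strides.items())
        matrix[idx[mode]][col] = tensor[idx]
    return matrix


def fold(matrix, shape, mode):
    """Inverse of unfold; needs the original shape and mode (the 'extra information')."""
    strides, _ = column_strides(shape, mode)
    tensor = {}
    for idx in product(*(range(n) for n in shape)):
        col = sum(idx[k] * s for k, s in strides.items())
        tensor[idx] = matrix[idx[mode]][col]
    return tensor


shape = (2, 3, 4)
tensor = {(i, j, k): 100 * i + 10 * j + k
          for i, j, k in product(range(2), range(3), range(4))}
unfolded = unfold(tensor, shape, mode=1)
restored = fold(unfolded, shape, mode=1)
print(len(unfolded), len(unfolded[0]))
```

Without the stored shape and mode, the 3-by-8 matrix alone would not determine which tensor it came from.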


Terra is a new low-level system programming language that is designed to interoperate seamlessly with the Lua programming language.

Like C, Terra is a simple, statically-typed, compiled language with manual memory management. But unlike C, it is designed from the beginning to interoperate with Lua. Terra functions are first-class Lua values created using the terra keyword. When needed they are JIT-compiled to machine code.


An open source platform for working with collections of texts. It enables students, researchers and teachers to share and collaborate around texts using a simple and intuitive interface. TEXTUS currently enables users to:

  • Collaboratively annotate texts and view the annotations of others

  • Reliably cite electronic versions of texts

  • Create bibliographies with stable URLs to online versions of those texts


The High Altitude Observatory at the National Center for Atmospheric Research has developed a series of numeric simulation models of the Earth’s upper atmosphere, including the upper Stratosphere, Mesosphere, and Thermosphere. The Thermospheric General Circulation Models (TGCM’s) are three-dimensional, time-dependent models of the Earth’s neutral upper atmosphere. The models use a finite differencing technique to obtain a self-consistent solution for the coupled, nonlinear equations of hydrodynamics, thermodynamics, continuity of the neutral gas and for the coupling between the dynamics and the composition.

Recent models in the series include a self-consistent aeronomic scheme for the coupled Thermosphere/Ionosphere system, the Thermosphere Ionosphere Electrodynamic General Circulation Model (TIEGCM), and an extension of the lower boundary from 97 to 30 km, including the physical and chemical processes appropriate for the Mesosphere and upper Stratosphere, the Thermosphere Ionosphere Mesosphere Electrodynamic General Circulation Model (TIME-GCM). A global mean, or column model, has also been developed in parallel with the TGCM’s. The global mean model is used as a time-dependent, one-dimensional platform from which new chemical, dynamic and numeric schemes are developed and tested before being introduced into the 3-d GCM’s.




NcSOS adds an OGC SOS service to datasets in your existing THREDDS server. It complies with the IOOS SWE Milestone 1.0 templates and requires your datasets be in any of the CF 1.6 Discrete Sampling Geometries.

NcSOS acts like other THREDDS services (such as OPeNDAP and WMS) in that there are individual service endpoints for each dataset. It is best to aggregate your files and enable the NcSOS service on top of the aggregation. For example, the NcML aggregate of hourly files from an individual station would be a good candidate to serve with NcSOS, whereas serving the individual hourly files with NcSOS would not be as beneficial.


Styles for a THREDDS server.


An interface definition language and binary communication protocol that is used to define and create services for numerous languages. It is used as a remote procedure call (RPC) framework and was developed at Facebook for "scalable cross-language services development". It combines a software stack with a code generation engine to build services that work efficiently to a varying degree and seamlessly between C#, C++ (on POSIX-compliant systems), Cappuccino, Cocoa, Delphi, Erlang, Go, Haskell, Java, Node.js, OCaml, Perl, PHP, Python, Ruby and Smalltalk. Although developed at Facebook, it is now an open source project in the Apache Software Foundation.

Thrift includes a complete stack for creating clients and servers. The top part is generated code from the Thrift definition. Client and processor code for each service is generated from this file. In contrast to built-in types, created data structures are sent as result in generated code. The protocol and transport layer are part of the runtime library. With Thrift, it is possible to define a service and change the protocol and transport without recompiling the code. Besides the client part, Thrift includes server infrastructure to tie protocols and transports together, like blocking, non-blocking, and multi-threaded servers. The underlying I/O part of the stack is implemented differently for different languages.


Thrust is a parallel algorithms library which resembles the C++ Standard Template Library (STL). Thrust’s high-level interface greatly enhances developer productivity while enabling performance portability between GPUs and multicore CPUs. Interoperability with established technologies (such as CUDA, TBB and OpenMP) facilitates integration with existing software.


Library for the Fourier interpolation of regular three-dimensional data-sets.


TinyOS is an "operating system" designed for low-power wireless embedded systems. Fundamentally, it is a work scheduler and a collection of drivers for microcontrollers and other ICs commonly used in wireless embedded platforms.


Topic modeling is a machine learning method that learns underlying themes in a collection of documents, which can be used to summarize and organize the documents. We have created a method for visualizing topic models, allowing users to explore a corpus by navigating between high level topic descriptions and individual documents, hopefully deepening their understanding of the corpus.


With Toolboxes for Complex Systems we provide a compilation of innovative methods for modern nonlinear data analysis. These methods were developed during scientific research in the Interdisciplinary Center for Dynamics of Complex Systems Potsdam, the Cardiovascular Physics Group at the Humboldt-Universität zu Berlin, and the Potsdam Institute for Climate Impact Research (PIK).


Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.

The components are:

  • A web framework (including RequestHandler which is subclassed to create web applications, and various supporting classes).

  • Client- and server-side implementations of HTTP (HTTPServer and AsyncHTTPClient).

  • An asynchronous networking library (IOLoop and IOStream), which serve as the building blocks for the HTTP components and can also be used to implement other protocols.

  • A coroutine library (tornado.gen) which allows asynchronous code to be written in a more straightforward way than chaining callbacks.

The Tornado web framework and HTTP server together offer a full-stack alternative to WSGI. While it is possible to use the Tornado web framework in a WSGI container (WSGIAdapter), or use the Tornado HTTP server as a container for other WSGI frameworks (WSGIContainer), each of these combinations has limitations, and to take full advantage of Tornado you will need to use Tornado’s web framework and HTTP server together.


With the rise of governmental monitoring programs, Tox, a FOSS initiative, aims to be an easy to use, all-in-one communication platform that ensures its users full privacy and secure message delivery. The goal of this project is to create a configuration-free P2P Skype replacement. “Configuration-free” means that the user will simply have to open the program and will be capable of adding people and communicating with them without having to set up an account.


Python binding for Project Tox.


Transmageddon is a video transcoder for Linux and Unix systems built using GStreamer. It supports almost any format as its input and can generate a very large host of output files. The goal of the application was to help people to create the files they need to be able to play on their mobile devices and for people not hugely experienced with multimedia to generate a multimedia file without having to resort to command line tools with ungainly syntaxes.


TsunAWI was developed in the framework of the GITEWS project (German-Indonesian Tsunami Early Warning System). It discretizes the non-linear shallow water equations on an unstructured triangular mesh and makes it possible to simulate the propagation of tsunamis from the origin to the inundation on land. It was used to calculate 4500 scenarios for the Indonesian tsunami early warning system.


The TT (Tensor Train) format is an efficient way for low-parametric representation of high-dimensional tensors. The TT-Toolbox is a MATLAB implementation of basic operations with tensors in TT-format.
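As an illustration of why the format is low-parametric, the sketch below stores the 3-way tensor A(i1, i2, i3) = i1 + i2 + i3 as three small "cores" and recovers any entry as a product of matrices. It is plain Python written for this example, not the TT-Toolbox API; the function names and the choice of tensor are assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def tt_cores(n):
    """TT cores of rank 2 for A(i1, i2, i3) = i1 + i2 + i3, indices in range(n)."""
    g1 = [[[i, 1]] for i in range(n)]            # 1 x 2 matrix per index value
    g2 = [[[1, 0], [i, 1]] for i in range(n)]    # 2 x 2 matrix per index value
    g3 = [[[1], [i]] for i in range(n)]          # 2 x 1 matrix per index value
    return [g1, g2, g3]


def tt_element(cores, index):
    """Entry A(index) = G1[i1] * G2[i2] * ... * Gd[id], a 1 x 1 matrix."""
    result = cores[0][index[0]]
    for core, i in zip(cores[1:], index[1:]):
        result = matmul(result, core[i])
    return result[0][0]


def tt_storage(cores):
    """Number of stored values across all cores."""
    return sum(len(m) * len(m[0]) for core in cores for m in core)


n = 10
cores = tt_cores(n)
value = tt_element(cores, (3, 4, 5))
stored = tt_storage(cores)
print(value, stored, n ** 3)
```

Here the cores hold 80 numbers while the full tensor has 1000 entries; the gap widens rapidly as the number of dimensions grows.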


Python implementation of the TT-Toolbox.


Tulip is an information visualization framework dedicated to the analysis and visualization of relational data. Tulip aims to provide the developer with a complete library, supporting the design of interactive information visualization applications for relational data that can be tailored to the problems he or she is addressing.

Written in C++, the framework enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. One of the goals of Tulip is to facilitate the reuse of components and allow developers to focus on programming their application. This development pipeline makes the framework efficient for research prototyping as well as the development of end-user applications.

Tulip Python

Tulip Python is a set of modules that exposes to Python almost all the content of the Tulip C++ API. The bindings have been developed with the SIP tool from Riverbank.


Users want to share more and more photos and videos. But mobile networks are fragile. Platform APIs are a mess. Every project builds its own file uploader. A thousand one week projects that barely work, when all we need is one real project, done right.

We are going to do this right. We will solve reliable file uploads for once and for all. A new open protocol for resumable uploads built on HTTP. Simple, cheap, reusable stacks for clients and servers. Any language, any platform, any network.
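The core idea of a resumable upload can be sketched in a few lines: the server remembers how many bytes it has received, and after a failure the client asks for that offset and continues from there. The code below is an in-memory simulation only. It does not speak the actual wire protocol, and every class and function name in it is hypothetical.

```python
class UploadServer:
    """In-memory stand-in for a server that remembers what it has received."""

    def __init__(self):
        self.received = bytearray()

    def offset(self):
        return len(self.received)

    def append(self, offset, chunk):
        if offset != len(self.received):
            raise ValueError("client and server disagree about the offset")
        self.received.extend(chunk)


def make_flaky_send(failing_calls):
    """Return a send function that drops the link on the given call numbers."""
    calls = {"n": 0}

    def send(server, offset, chunk):
        calls["n"] += 1
        if calls["n"] in failing_calls:
            # Only the first half of the chunk arrives before the link drops.
            server.append(offset, chunk[: len(chunk) // 2])
            raise ConnectionError("link dropped mid-chunk")
        server.append(offset, chunk)

    return send


def upload(server, payload, chunk_size, send):
    """Send payload in chunks, always resuming from the server's offset."""
    attempts = 0
    while server.offset() < len(payload):
        start = server.offset()  # ask the server where to continue
        attempts += 1
        try:
            send(server, start, payload[start:start + chunk_size])
        except ConnectionError:
            continue  # retry from whatever the server actually stored
    return attempts


payload = bytes(range(100))
server = UploadServer()
attempts = upload(server, payload, 30, make_flaky_send({2, 4}))
print(attempts, bytes(server.received) == payload)
```

Because the client never trusts its own idea of progress, a partially delivered chunk costs only the bytes that were lost, not the whole file.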


Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.


A CoAP library for the Twisted framework.


A free and open source web content management system (CMS) based on PHP. It can run on several web servers, such as Apache or IIS, on top of many operating systems. TYPO3 is credited to be highly flexible. It can be extended by new functions without writing any program code. Also, the software is available in more than 50 languages and has a built-in localization system, and therefore supports publishing content in multiple languages. Due to its features, scalability and maturity, TYPO3 is used to build and manage websites of different types and size ranges, from small sites for individuals or nonprofit organizations to multilingual enterprise solutions for large corporations.

Delivered with a base set of interfaces, functions and modules, TYPO3’s functionality spectrum is implemented by extensions. More than 5000 extensions are currently available for TYPO3 for download under the GNU General Public License from a repository called the TYPO3 Extension Repository, or TER.


The Uintah software suite is a set of libraries and applications for simulating and analyzing complex chemical and physical reactions. These reactions are modeled by solving partial differential equations on structured adaptive grids using hundreds to thousands of processors (though smaller simulations may also be run on a scientist’s desktop computer). Key software applications have been developed for exploring the fine details of metal containers (encompassing energetic materials) embedded in large hydrocarbon fires. Uintah’s underlying technologies have led to novel techniques for understanding large pool eddy fires as well as new methods for simulating fluid-structure interactions. The software is general purpose in nature and the breadth of simulation domains continues to grow beyond the original focus of the C-SAFE initiative.

UNIX Utilities

The various available flavours.


BusyBox combines tiny versions of many common UNIX utilities into a single small executable. It provides replacements for most of the utilities you usually find in GNU fileutils, shellutils, etc. The utilities in BusyBox generally have fewer options than their full-featured GNU cousins; however, the options that are included provide the expected functionality and behave very much like their GNU counterparts. BusyBox provides a fairly complete environment for any small or embedded system.

BusyBox has been written with size-optimization and limited resources in mind. It is also extremely modular so you can easily include or exclude commands (or features) at compile time. This makes it easy to customize your embedded systems. To create a working system, just add some device nodes in /dev, a few configuration files in /etc, and a Linux kernel.

GNU Core Utilities

The GNU Core Utilities are the basic file, shell and text manipulation utilities of the GNU operating system. These are the core utilities which are expected to exist on every operating system.

The GNU Core Utilities or coreutils is a package of GNU software containing many of the basic tools, such as cat, ls, and rm, needed for Unix-like operating systems. It is a combination of a number of earlier packages, including textutils, shellutils, and fileutils, along with some other miscellaneous utilities.

Heirloom Toolchest

The Heirloom Toolchest is a collection of standard Unix utilities. Derived from original Unix material released as Open Source by Caldera and Sun. Multiple versions of many utilities are provided to approach compatibility with various specifications and Unix flavors. Support for lines of arbitrary length and in many cases binary input data. Support for multibyte characters in UTF-8 and many East Asian encodings. Extensive documentation including a manual page for any utility.


Toybox combines the most common Linux command line utilities together into a single BSD-licensed executable. It’s simple, small, fast, and reasonably standards-compliant (POSIX-2008 and LSB 4.1).

Toybox’s 1.0 release goal is to turn generic Android into a development environment capable of compiling Linux From Scratch. A tiny system built from just toybox, linux, a C library, and a C compiler (such as LLVM or gcc 4.2.1+binutils 2.17) should be able to rebuild itself from source code without needing any other packages.

Toybox is an implementation of some Linux command line utilities started in 2006,[1] and became a BSD-licensed BusyBox alternative.


A set of approximately 100 basic Linux system utilities not included in GNU Core Utilities.

User-Mode Linux

User-Mode Linux is a safe, secure way of running Linux versions and Linux processes. Run buggy software, experiment with new Linux kernels or distributions, and poke around in the internals of Linux, all without risking your main Linux setup.

User-Mode Linux gives you a virtual machine that may have more hardware and software virtual resources than your actual, physical computer. Disk storage for the virtual machine is entirely contained inside a single file on your physical machine. You can assign your virtual machine only the hardware access you want it to have. With properly limited access, nothing you do on the virtual machine can change or damage your real computer, or its software.


UV-CDAT is a powerful and complete front-end to a rich set of visual-data exploration and analysis capabilities well suited for climate-data analysis problems.


Vagrant is a tool for building complete development environments. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases development/production parity, and makes the "works on my machine" excuse a relic of the past.

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

To achieve its magic, Vagrant stands on the shoulders of giants. Machines are provisioned on top of VirtualBox, VMware, AWS, or any other provider. Then, industry-standard provisioning tools such as shell scripts, Chef, or Puppet, can be used to automatically install and configure software on the machine.


The Vaucanson platform VCSN is software dedicated to the computation of, and with, finite state machines. Here "finite state machines" is to be understood in the broadest possible sense: finite automata with output (then often called transducers) or, even more generally, finite automata with multiplicity, that is, automata that not only accept, or recognize, sequences of symbols but also compute for every such sequence an associated value, which can be taken in any semiring. Hence the variety of situations that can be modelled.

VCSN has been designed with (at least) three goals in mind: to allow generic programming of a wide class of finite automata, to provide a language close to the mathematical description of algorithms on automata, and to be free and open software.
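The idea of an automaton that computes a semiring value for every input sequence can be sketched in a few lines. This is not VCSN's interface: the `Semiring` class, the `word_weight` function and the two-state example automaton are all hypothetical names and data chosen for illustration, and the tropical (min, +) semiring is just one possible choice.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # combines alternative paths
    times: Callable[[Any, Any], Any]  # combines weights along one path
    zero: Any                         # neutral element for plus
    one: Any                          # neutral element for times


# Tropical semiring: "plus" is min, "times" is ordinary addition.
TROPICAL = Semiring(plus=min, times=lambda a, b: a + b,
                    zero=float("inf"), one=0.0)

# Hypothetical automaton: state -> weight, (state, symbol) -> [(next, weight)].
INITIAL = {0: TROPICAL.one}
FINAL = {1: TROPICAL.one}
TRANSITIONS = {
    (0, "a"): [(0, 1.0), (1, 5.0)],
    (0, "b"): [(1, 2.0)],
    (1, "a"): [(1, 1.0)],
}


def word_weight(word, initial, final, transitions, sr):
    """Return the semiring value the automaton associates with `word`."""
    current = dict(initial)
    for symbol in word:
        following = {}
        for state, weight in current.items():
            for target, step in transitions.get((state, symbol), []):
                candidate = sr.times(weight, step)
                following[target] = sr.plus(following.get(target, sr.zero),
                                            candidate)
        current = following
    total = sr.zero
    for state, weight in current.items():
        if state in final:
            total = sr.plus(total, sr.times(weight, final[state]))
    return total
```

With the tropical semiring the value of a word is the cost of its cheapest accepting path: `word_weight("ab", INITIAL, FINAL, TRANSITIONS, TROPICAL)` gives 3.0, while a word with no accepting path, such as "bb", gives the semiring zero (infinity). Swapping in a Boolean semiring would turn the same function into a plain acceptance test.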


We present a new package, VEST (Vector Einstein Summation Tools), that performs abstract vector calculus computations in Mathematica. Through the use of index notation, VEST is able to reduce three-dimensional scalar and vector expressions of a very general type to a well defined standard form. In addition, utilizing properties of the Levi-Civita symbol, the program can derive types of multi-term vector identities that are not recognized by reduction, subsequently applying these to simplify large expressions.
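The kind of Levi-Civita identity that index notation makes mechanical can be checked numerically. The sketch below is plain Python rather than Mathematica and has nothing to do with VEST's own code; the function names are assumptions made for this example. It verifies the contraction identity eps_ijk eps_imn = delta_jm delta_kn - delta_jn delta_km and uses it indirectly through the "BAC-CAB" rule.

```python
from itertools import product


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (j - i) * (k - i) * (k - j) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def epsilon_delta_identity_holds():
    """Check sum_i eps_ijk eps_imn == d_jm d_kn - d_jn d_km for all indices."""
    for j, k, m, n in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, m, n) for i in range(3))
        rhs = delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
        if lhs != rhs:
            return False
    return True


def cross(a, b):
    """Cross product written in index notation: (a x b)_i = eps_ijk a_j b_k."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bac_cab(a, b, c):
    """Right-hand side of a x (b x c) = b (a.c) - c (a.b)."""
    return [b[i] * dot(a, c) - c[i] * dot(a, b) for i in range(3)]
```

For integer vectors the two sides of the triple-product rule agree exactly, so `cross(a, cross(b, c)) == bac_cab(a, b, c)` is a direct check.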


ViennaCL is a free open-source linear algebra library for computations on many-core architectures (GPUs, MIC) and multi-core CPUs. The library is written in C++ and supports CUDA, OpenCL, and OpenMP. In addition to core functionality and many other features including BLAS level 1-3 support and iterative solvers, the latest release family ViennaCL 1.6.x provides fast pipelined iterative solvers including fast sparse matrix-vector products based on CSR-adaptive, a new fully HTML-based documentation, and a new sparse matrix type. Also, a Python wrapper named PyViennaCL is available.
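For readers unfamiliar with the CSR layout behind the sparse matrix-vector products mentioned above, here is a minimal serial sketch in plain Python. It shows only the basic CSR storage scheme and loop; it is not the CSR-adaptive scheme, it is not ViennaCL or PyViennaCL code, and the function name `csr_matvec` is an assumption for this example.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[r] .. row_ptr[r + 1] delimits the nonzeros of row r;
    col_idx and values hold their column indices and numeric values.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


# The 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
ROW_PTR = [0, 2, 3, 5]
COL_IDX = [0, 2, 1, 0, 2]
VALUES = [4.0, 1.0, 3.0, 2.0, 5.0]
```

Each row's work is independent of the others, which is what makes this product a natural fit for many-core hardware; rows with very different nonzero counts are the load-balancing problem that more elaborate schemes address.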


VIGRA stands for "Vision with Generic Algorithms". It’s an image processing and analysis library that puts its main emphasis on customizable algorithms and data structures. VIGRA is especially strong for multi-dimensional images, because many algorithms (e.g. filters, feature computation, superpixels) are implemented for arbitrary high dimensions. By using template techniques similar to those in the C++ Standard Template Library, you can easily adapt any VIGRA component to the needs of your application, without thereby giving up execution speed. As of version 1.7.1, VIGRA also provides extensive Python bindings on the basis of the popular numpy framework.



FAUmachine is a virtual machine, similar in many respects to VMWare[tm], QEMU or Virtual PC[tm]. What distinguishes FAUmachine from these other virtual machines are the following features:

  • The FAUmachine virtual machine runs as a normal user process (no root privileges or kernel modules needed) on top of (currently) Linux on i386 and AMD64 hardware. The port of FAUmachine to OpenBSD and Mac OS X (intel) is in progress.

  • Fault injection capability for experimentation in FAUmachine.

  • VHDL interpreter for automating experiments and tests based upon our project fauhdlc. We also ship example scripts for our VHDL interpreter that allow the automatic installation of several Linux distributions and other operating systems using the distribution’s cdrom.

  • The CPU of FAUmachine is based on the virtual CPU from Fabrice Bellard’s excellent QEMU simulator, which can execute anything a real x86/AMD64 CPU can execute, too.


KVM (for Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V).


A toolkit to interact with the virtualization capabilities of recent versions of Linux (and other OSes).


LXC (Linux Containers) is an operating-system-level virtualization environment for running multiple isolated Linux systems (containers) on a single Linux control host. LXC provides operating system-level virtualization through a virtual environment that has its own process and network space, instead of creating a full-fledged virtual machine.

The Linux kernel provides the cgroups functionality that allows limitation and prioritization of resources (CPU, memory, block I/O, network, etc.) without the need for starting any virtual machines, and namespace isolation functionality that allows complete isolation of an application’s view of the operating environment, including process trees, networking, user IDs and mounted file systems.[3]

LXC combines the kernel’s cgroups and support for isolated namespaces to provide an isolated environment for applications. Docker can also use LXC as one of its execution drivers, enabling image management and providing deployment services.


micro-CernVM is the heart of the CernVM 3 virtual appliance. It is based on Scientific Linux 6 combined with a custom, virtualization-friendly Linux kernel. This image is also fully RPM based; you can use yum and rpm to install additional packages.

micro-CernVM’s outstanding feature is that it does not require a hard disk image to be distributed (hence "micro"). Instead it is distributed as a CD-ROM image of ~10MB containing a Linux kernel and the CernVM-FS client. The rest of the operating system is downloaded and cached on demand by CernVM-FS. The virtual machine still requires a hard disk as a persistent cache, but this hard disk is initially empty and can be created instantaneously, instead of being pre-created and distributed.


Mirage OS is a library operating system that constructs unikernels for secure, high-performance network applications across a variety of cloud computing and mobile platforms. Code can be developed on a normal OS such as Linux or MacOS X, and then compiled into a fully-standalone, specialised unikernel that runs under the Xen hypervisor.

Since Xen powers most public cloud computing infrastructure such as Amazon EC2 or Rackspace, this lets your servers run more cheaply, securely and with finer control than with a full software stack.

Mirage uses the OCaml language, with libraries that provide networking, storage and concurrency support that work under Unix during development, but become operating system drivers when being compiled for production deployment. The framework is fully event-driven, with no support for preemptive threading.


OpenShift Origin is Red Hat’s open source Platform as a Service (PaaS) offering. OpenShift Origin is an application platform where application developers and teams can build, test, deploy, and run their applications.


OpenVZ is container-based virtualization for Linux. OpenVZ creates multiple secure, isolated Linux containers (otherwise known as VEs or VPSs) on a single physical server enabling better server utilization and ensuring that applications do not conflict. Each container performs and executes exactly like a stand-alone server; a container can be rebooted independently and have root access, users, IP addresses, memory, processes, files, applications, system libraries and configuration files.

OpenVZ software consists of an optional custom Linux kernel and command-line tools (mainly vzctl). Our kernel developers work hard to merge containers functionality into the upstream Linux kernel, making the OpenVZ team the biggest contributor to the Linux Containers (LXC) kernel, with features such as PID and network namespaces, memory controller, checkpoint-restore, and much more.


QEMU is a generic and open source machine emulator and virtualizer.

When used as a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM board) on a different machine (e.g. your own PC). By using dynamic translation, it achieves very good performance.

When used as a virtualizer, QEMU achieves near-native performance by executing the guest code directly on the host CPU. QEMU supports virtualization when executing under the Xen hypervisor or using the KVM kernel module in Linux. When using KVM, QEMU can virtualize x86, server and embedded PowerPC, and S390 guests.


VirtualBox is a powerful x86 and AMD64/Intel64 virtualization product for enterprise as well as home use. Not only is VirtualBox an extremely feature rich, high performance product for enterprise customers, it is also the only professional solution that is freely available as Open Source Software under the terms of the GNU General Public License (GPL) version 2.

Presently, VirtualBox runs on Windows, Linux, Macintosh, and Solaris hosts and supports a large number of guest operating systems including but not limited to Windows (NT 4.0, 2000, XP, Server 2003, Vista, Windows 7, Windows 8), DOS/Windows 3.x, Linux (2.4, 2.6 and 3.x), Solaris and OpenSolaris, OS/2, and OpenBSD. VirtualBox is being actively developed with frequent releases and has an ever growing list of features, supported guest operating systems and platforms it runs on.


The Xen Project hypervisor is an open-source type-1 or baremetal hypervisor, which makes it possible to run many instances of an operating system or indeed different operating systems in parallel on a single machine (or host). The Xen Project hypervisor is the only type-1 hypervisor that is available as open source. It is used as the basis for a number of different commercial and open source applications, such as: server virtualization, Infrastructure as a Service (IaaS), desktop virtualization, security applications, embedded and hardware appliances. The Xen Project hypervisor is powering the largest clouds in production today.

The hypervisor supports running two different types of guests: Paravirtualization (PV) and Full or Hardware assisted Virtualization (HVM). Both guest types can be used at the same time on a single hypervisor. It is also possible to use techniques used for Paravirtualization in an HVM guest: essentially creating a continuum between PV and HVM. This approach is called PV on HVM.

Paravirtualization (PV) is an efficient and lightweight virtualization technique originally introduced by Xen Project, later adopted by other virtualization platforms. PV does not require virtualization extensions from the host CPU. However, paravirtualized guests require a PV-enabled kernel and PV drivers, so the guests are aware of the hypervisor and can run efficiently without emulation or virtual emulated hardware. PV-enabled kernels exist for Linux, NetBSD, FreeBSD and OpenSolaris. Linux kernels have been PV-enabled from 2.6.24 using the Linux pvops framework. In practice this means that PV will work with most Linux distributions (with the exception of very old versions of distros).

Full Virtualization or Hardware-assisted (HVM) virtualization uses virtualization extensions from the host CPU to virtualize guests. HVM requires Intel VT or AMD-V hardware extensions. The Xen Project software uses Qemu to emulate PC hardware, including BIOS, IDE disk controller, VGA graphic adapter, USB controller, network adapter etc. Virtualization hardware extensions are used to boost performance of the emulation. Fully virtualized guests do not require any kernel support. This means that Windows operating systems can be used as a Xen Project HVM guest. Fully virtualized guests are usually slower than paravirtualized guests, because of the required emulation.

Mirage OS

Mirage is an exokernel (also called a Cloud Operating System) for constructing secure, high-performance network applications across a variety of cloud computing, embedded and mobile platforms. Mirage OS was initially designed for cloud use, which is why we call it a Cloud Operating System. Mirage OS applications are developed in a high-level functional programming language (OCaml) on a desktop OS such as Linux or Mac OSX, and are then compiled into a fully-standalone, specialised microkernel. These microkernels run directly on Xen Project hypervisor APIs. Since the Xen Project powers most public clouds such as Amazon EC2, Rackspace Cloud, and many others, Mirage lets your servers run more cheaply, securely and faster in any Xen Project based cloud or hosting service.


XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world’s largest clouds and enterprises. XenServer is an enterprise-class, cloud-proven, virtualization platform that delivers all of the critical features needed for any server and datacenter virtualization implementation.


The NASA Vision Workbench (VW) is a general purpose image processing and computer vision library developed by the Autonomous Systems and Robotics (ASR) Area in the Intelligent Systems Division at the NASA Ames Research Center.


Vispy is a high-performance interactive 2D/3D data visualization library. Vispy leverages the computational power of modern Graphics Processing Units (GPUs) through the OpenGL library to display very large datasets.


Vistle, the VISualization Testing Laboratory for Exascale computing, is an extensible software environment that integrates simulations on supercomputers, post-processing and parallel interactive visualization.

It has been under active development at HLRS since 2012 within the European project CRESTA and bwVisu. The objective is to provide a highly scalable successor to COVISE, exploiting data, task and pipeline parallelism in hybrid shared and distributed memory environments with acceleration hardware. Domain decompositions used during simulation can be reused for visualization.

A Vistle work flow consists of several processing modules, each of which is a parallel MPI program that uses OpenMP within nodes. These can be configured graphically or from Python. Shared memory is used for transferring data between modules on a single node. Work flows can be distributed across several clusters.

For rendering in immersive projection systems, Vistle uses OpenCOVER. Visualization parameters can be manipulated from within the virtual environment. Large data sets can be displayed with OpenGL sort-last parallel rendering and depth compositing. For scaling with the simulation on remote HPC resources, a CPU-based hybrid sort-last/sort-first parallel ray casting renderer is available. "Remote hybrid rendering" allows its output to be combined with local rendering, while ensuring smooth interactivity by decoupling it from remote rendering.

The Vistle system is modular and can be extended easily with additional visualization algorithms. Source code is available on GitHub and licensed under the LGPL.


COVISE, the collaborative visualization and simulation environment, is a modular distributed visualization system. As its focus is on visualization of scientific data in virtual environments, it comprises the VR renderer OpenCOVER.


VMD is designed for modeling, visualization, and analysis of biological systems such as proteins, nucleic acids, lipid bilayer assemblies, etc. It may be used to view more general molecules, as VMD can read standard Protein Data Bank (PDB) files and display the contained structure. VMD provides a wide variety of methods for rendering and coloring a molecule: simple points and lines, CPK spheres and cylinders, licorice bonds, backbone tubes and ribbons, cartoon drawings, and others. VMD can be used to animate and analyze the trajectory of a molecular dynamics (MD) simulation. In particular, VMD can act as a graphical front end for an external MD program by displaying and animating a molecule undergoing simulation on a remote computer.


VPaint is an innovative yet simple vector graphics program. This means that unlike pixel-based graphics programs (e.g., Photoshop), VPaint allows you to edit any of your pen strokes at any time, and resize your illustrations at any resolution without loss of detail or sharpness. Using VPaint is extremely easy. First, you sketch curves with your mouse, touch screen, or pen tablet. Then, you use the paint bucket to fill closed areas delimited by curves. Finally, you can edit your illustration by dragging painted areas or sculpting curves.


The Whole Atmosphere Community Climate Model (WACCM) is a comprehensive numerical model, spanning the range of altitude from the Earth’s surface to the thermosphere. The development of WACCM is an inter-divisional collaboration that unifies certain aspects of the upper atmospheric modeling of HAO, the middle atmosphere modeling of ACD, and the tropospheric modeling of CGD, using the NCAR Community Earth System Model (CESM) as a common numerical framework.


Web-based Python Data Analysis.

A Distributed Approach to Ocean Etc. Model Data Interoperability -

Testing pyoos ioos.get_observation parser with a Milestone 1 XML template file -

Creating your own IPython Notebook Blog environment on Wakari -

Testing IOOS Infrastructure with Wakari -

Installing Iris on Wakari -

Versioning Wakari Notebooks on Github -

Running a Shared Wakari Notebook/Environment -


Walrus is a tool for interactively visualizing large directed graphs in three-dimensional space. It is technically possible to display graphs containing a million nodes or more, but visual clutter, occlusion, and other factors can diminish the effectiveness of Walrus as the number of nodes, or the degree of their connectivity, increases. Thus, in practice, Walrus is best suited to visualizing moderately sized graphs that are nearly trees. A graph with a few hundred thousand nodes and only a slightly greater number of links is likely to be comfortable to work with.

Walrus computes its layout based on a user-supplied spanning tree. Because the specifics of the supplied spanning tree greatly affect the resulting display, it is crucial that the user supply a spanning tree that is both meaningful for the underlying data and appropriate for the desired insight. The prominence and orderliness that Walrus gives to the links in the spanning tree, in contrast to all other links, means that an arbitrarily chosen spanning tree may create a misleading or ineffective visualization. Ideally, the input graphs should be inherently hierarchical.

Walrus uses 3D hyperbolic geometry to display graphs under a fisheye-like distortion. At any moment, the amount of magnification, and thus the level of visible detail, varies across the display. This allows the user to examine the fine details of a small area while always having a view of the whole graph available as a frame of reference. Graphs are rendered inside a sphere that contains the Euclidean projection of 3D hyperbolic space. Points within the sphere are magnified according to their radial distance from the center. Objects near the center are magnified, while those near the boundary are shrunk. The amount of magnification decreases continuously and at an accelerated rate from the center to the boundary, until objects are reduced to zero size at the latter, which represents infinity. By bringing different parts of a graph to the magnified central region, the user can examine every part of the graph in detail.
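To make the role of the user-supplied spanning tree concrete, the sketch below splits the links of a small directed graph into tree links and non-tree links. It is not Walrus code and does not produce Walrus's input format; the function names and the example graph are hypothetical. Breadth-first search is used only as a placeholder: as noted above, an arbitrarily chosen tree may mislead, so a real data set should supply a tree that reflects its own hierarchy.

```python
from collections import deque


def bfs_spanning_tree(adjacency, root):
    """Return the set of (parent, child) links of a BFS tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return {(p, c) for c, p in parent.items() if p is not None}


def split_links(adjacency, root):
    """Separate all directed links into tree links and the remaining links."""
    tree = bfs_spanning_tree(adjacency, root)
    all_links = {(u, v) for u, targets in adjacency.items() for v in targets}
    return tree, all_links - tree


# Hypothetical four-node directed graph that is "nearly a tree".
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
```

For `GRAPH` rooted at "a", three links end up in the tree and two are left over; a layout driven by the tree would give prominence to the former.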


Also ridgelets, curvelets, starlets, wedgelets, bandlets, etc. See

for more about the whole zoo.


The Curvelet transform is a higher dimensional generalization of the Wavelet transform designed to represent images at different scales and different angles. CurveLab is a toolbox implementing the Fast Discrete Curvelet Transform, both in Matlab and C++.


In recent years it has turned out that shearlets have the potential to retrieve directional information so that they became interesting for many applications. Moreover the continuous shearlet transform has the outstanding property to stem from a square integrable group representation. However, to use shearlets and the shearlet transform for reasonable applications one needs fast algorithms to compute a discrete shearlet transform. In this tutorial we present the steps towards an implementation of a fast and finite shearlet transform that is only based on the FFT.

The FFST package provides a fast implementation of the Finite Shearlet Transform. Following the path via the continuous shearlet transform, its counterpart on cones and finally its discretization on the full grid, we obtain the translation invariant discrete shearlet transform. Our discrete shearlet transform can be efficiently computed by the fast Fourier transform (FFT). The discrete shearlets constitute a Parseval frame of the finite Euclidean space such that the inversion of the shearlet transform can be simply done by applying the adjoint transform.


The Interactive Sparse Astronomical Data Analysis packages are a collection of packages in IDL, C++ and Matlab related to sparsity and its application in astronomical data analysis. The components include:

  • Sparse2D - sparse decomposition, denoising and deconvolution for 1- and 2-D datasets

  • MSVST - multi-scale variance stabilizing transform for 1- and 2-D datasets

  • MRS - sparse, multiresolution representation on the sphere

  • SparsePOL - polarized wavelets and curvelets on the sphere

  • MRS-MSVSTS - multi-scale variance stabilizing transform on the sphere


This package contains some MatLab tools for multi-scale image processing. Briefly, the tools include:

  • Recursive multi-scale image decompositions (pyramids), including Laplacian pyramids, QMFs, Wavelets, and steerable pyramids. These operate on 1D or 2D signals of arbitrary dimension. Data structures are compatible with the MatLab wavelet toolbox.

  • Fast 2D convolution routines, with subsampling and boundary-handling.

  • Fast point-operations, histograms, histogram-matching.

  • Fast synthetic image generation: sine gratings, zone plates, fractals, etc.

  • Display routines for images and pyramids. These include several auto-scaling options, rounding to integer zoom factors to avoid resampling artifacts, and useful labeling (dimensions and gray-range).


PyWavelets is free, open source wavelet transform software for the Python programming language. It is written in Python, Cython and C, combining an easy yet powerful high-level interface with high performance.
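As a reminder of what a discrete wavelet transform computes, here is a single-level orthonormal Haar transform and its inverse in plain Python. This is a hand-written sketch, not PyWavelets code or its API; the names `haar_dwt` and `haar_idwt` are assumptions for this example, and the library itself covers many more wavelets, levels and boundary modes.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform on an even-length signal.

    Returns (approximation, detail): scaled pairwise sums and differences.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, interleaving the reconstructed sample pairs."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / scale, (a - d) / scale])
    return signal
```

Because the transform is orthonormal, the sum of squares of the coefficients equals that of the input, and applying `haar_idwt` to the output of `haar_dwt` recovers the signal up to rounding.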


Wavelets and their associated transforms are highly efficient when approximating and analyzing one-dimensional signals. However, multivariate signals such as images or videos typically exhibit curvilinear singularities, which wavelets are provably deficient at sparsely approximating and also at analyzing in the sense of, for instance, detecting their direction. Shearlets are a directional representation system extending the wavelet framework, which overcomes those deficiencies. Similar to wavelets, shearlets allow a faithful implementation and fast associated transforms.

This package provides MATLAB code for a novel faithful algorithmic realization of the 2D and 3D shearlet transform (and their inverses) associated with compactly supported universal shearlet systems, incorporating the option of using CUDA.


Wayland is intended as a simpler replacement for X, easier to develop and maintain. GNOME and KDE are expected to be ported to it.

Wayland is a protocol that specifies the communication between a display server (called Wayland compositor) and its clients, as well as a reference implementation of the protocol in the C programming language.[4]

Wayland is developed by a group of volunteers led by Kristian Høgsberg as a free and open-source software community-driven project with the aim of replacing the X Window System with a modern, simpler windowing system in Linux and Unix-like operating systems.

Wayland consists of a protocol and a reference implementation named Weston. The project is also developing versions of GTK+ and Qt that render to Wayland instead of to X. Most applications are expected to gain support for Wayland through one of these libraries without modification to the application.


WCSAxes is a framework for making plots of Astronomical data in Matplotlib.


Full duplex messaging between web browsers and servers. It takes care of handling the WebSocket connections, launching your programs to handle the WebSockets, and passing messages between programs and the web browser. It’s like CGI, twenty years later, for WebSockets.
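A handler program in this CGI-like model can be very small. The sketch below assumes the behaviour described in the websocketd documentation: each incoming message arrives as a line on standard input, and each line the program writes to standard output goes back to the browser. The `respond` and `serve` names, the echo behaviour and the `--serve` switch are hypothetical choices for this example.

```python
import sys


def respond(message):
    """Build the reply for one incoming message (hypothetical behaviour)."""
    return "echo: " + message.upper()


def serve(lines_in=sys.stdin, out=sys.stdout):
    """Answer every input line with one output line, flushing each time."""
    for line in lines_in:
        out.write(respond(line.rstrip("\n")) + "\n")
        out.flush()  # unflushed output would sit in a buffer, not reach the client


# "--serve" is a made-up switch so that importing this file has no side effects.
if __name__ == "__main__" and "--serve" in sys.argv[1:]:
    serve()
```

Saved as, say, handler.py, it could be launched with something like `websocketd --port=8080 python3 handler.py --serve`. Flushing after every line matters because the reply is otherwise held back until the buffer fills.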


This tool provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications and for further research.

The word2vec tool takes a text corpus as input and produces the word vectors as output. It first constructs a vocabulary from the training text data and then learns vector representation of words. The resulting word vector file can be used as features in many natural language processing and machine learning applications.
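The first step described above, building a vocabulary and turning the corpus into training examples, can be sketched as follows for the skip-gram case. This is not the word2vec tool's code: the function names, the fixed window and the tie-breaking order are assumptions for this example, and the actual learning of the vectors is left out entirely.

```python
from collections import Counter


def build_vocab(tokens, min_count=1):
    """Map each sufficiently frequent word to an id, most frequent first."""
    counts = Counter(tokens)
    words = sorted((w for w, c in counts.items() if c >= min_count),
                   key=lambda w: (-counts[w], w))
    return {word: index for index, word in enumerate(words)}


def skipgram_pairs(tokens, vocab, window=2):
    """Return (center_id, context_id) pairs for words within `window` positions."""
    ids = [vocab[t] for t in tokens if t in vocab]
    pairs = []
    for pos, center in enumerate(ids):
        start = max(0, pos - window)
        stop = min(len(ids), pos + window + 1)
        for ctx in range(start, stop):
            if ctx != pos:
                pairs.append((center, ids[ctx]))
    return pairs


TOKENS = "the cat sat on the mat".split()
VOCAB = build_vocab(TOKENS)
PAIRS = skipgram_pairs(TOKENS, VOCAB, window=1)
```

Each pair is one training example: the model is asked to predict the context word from the center word. The continuous bag-of-words architecture reverses the direction and predicts the center word from its surrounding context.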


This tool from NASA’s EOSDIS provides the capability to interactively browse global, full-resolution satellite imagery and then download the underlying data. Most of the 100+ available products are updated within three hours of observation, essentially showing the entire Earth as it looks "right now". This supports time-critical application areas such as wildfire management, air quality measurements, and flood monitoring. Arctic and Antarctic views of several products are also available for a "full globe" perspective. Browsing on tablet and smartphone devices is generally supported for mobile access to the imagery.

Worldview uses the Global Imagery Browse Services (GIBS) to rapidly retrieve its imagery for an interactive browsing experience. While Worldview uses OpenLayers as its mapping library, GIBS imagery can also be accessed from Google Earth, NASA World Wind, and several other clients. We encourage interested developers to build their own clients or integrate NASA imagery into their existing ones using these services.


The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both atmospheric research and operational forecasting needs. It features two dynamical cores, a data assimilation system, and a software architecture facilitating parallel computation and system extensibility. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers.

WRF allows researchers to generate atmospheric simulations based on real data (observations, analyses) or idealized conditions. WRF offers operational forecasting a flexible and computationally-efficient platform, while providing advances in physics, numerics, and data assimilation contributed by developers in the broader research community. WRF is currently in operational use at NCEP, AFWA, and other centers.


The Community Gridpoint Statistical Interpolation (GSI) system is a variational data assimilation system, designed to be flexible, state-of-the-art, and run efficiently on various parallel computing platforms. The GSI system is in the public domain and is freely available for community use. The testing and support of this GSI system at the DTC currently focus on regional numerical weather prediction (NWP) applications coupled with the Weather Research and Forecasting (WRF) Model, but the GSI can be applied to the Global Forecast System (GFS) as well as other modelling systems.

GSI version 3.3 is an operational data assimilation system available for community use. Some of its advanced features are listed as follows:

  • Combined with an ensemble system, this version of GSI can be used as an ensemble-variational hybrid data assimilation system.

  • Coupled with forecast models and their adjoint models, GSI can be turned into a four-dimensional variational (4D-Var) system.

  • GSI features capabilities for observation sensitivity calculation.

  • The observation operators in GSI can be used in an EnKF system or other data analysis systems, transforming model variables to observed variables at the observational space.


The Weather Research and Forecasting (WRF) Model is designed to serve both operational forecasting and atmospheric research needs. It features two dynamic cores, multiple physical parameterizations, a variational data assimilation system, ability to couple with an ocean model, and a software architecture allowing for computational parallelism and system extensibility. WRF is suitable for a broad spectrum of applications, including tropical storms.

Two robust configurations of WRF for tropical storms are the NOAA operational model Hurricane WRF (HWRF) and the National Center for Atmospheric Research (NCAR) Advanced Research Hurricane WRF (AHW). In this website users can obtain codes, datasets, and information for running both HWRF and AHW.


The Model Evaluation Tools (MET) verification package is a highly-configurable, state-of-the-art suite of verification tools. It was developed using output from the Weather Research and Forecasting (WRF) modeling system but may be applied to the output of other modeling systems as well.


The NOAA Environmental Modeling System (NEMS) Nonhydrostatic Multiscale Model on the B-grid (NMMB) was developed and continues to be enhanced to establish a common modeling framework that facilitates streamlined interactions of analysis, forecast, and post-processing systems within NCEP. The NEMS architecture is a high performance software superstructure and infrastructure based on the Earth System Modeling Framework (ESMF) for use in operational prediction models at NCEP.

The NAM modeling suite was the first operational implementation of NEMS at NCEP. The NAM prediction model within the NEMS framework is the Nonhydrostatic Multiscale Model on B-grid (NMMB), which can be run globally or regionally with embedded nests.


The Nonhydrostatic Mesoscale Model (NMM) core of the Weather Research and Forecasting (WRF) system is designed to be a flexible, state-of-the-art atmospheric simulation system that is portable and efficient on available parallel computing platforms. WRF-NMM is suitable for use in a broad range of applications across scales ranging from meters to thousands of kilometers. The package includes:

  • Nonhydrostatic Mesoscale Model (NMM) dynamic solver, including one-way and two-way static nesting;

  • WRF Preprocessing System (WPS);

  • Numerous physics packages contributed by WRF partners and the research community; and

  • Unified Post Processor (UPP) software package and sample scripts for several graphical packages.


This code is a reference implementation of our paper Wavelet Turbulence for Fluid Simulation. The code is intended as a pedagogical example, so clarity has been given preference over performance. Optimizations that inhibit readability have been removed, so the running times experienced will be longer than those reported in the paper.


XBeach is a two-dimensional model for wave propagation, long waves and mean flow, sediment transport and morphological changes of the nearshore area, beaches, dunes and backbarrier during storms.


XBee is the brand name from Digi International for a family of form factor compatible radio modules. The first XBee radios were introduced under the MaxStream brand in 2005[2] and were based on the 802.15.4-2003 standard designed for point-to-point and star communications at over-the-air baud rates of 250 kbit/s.

XBee API Mode Tutorial Using Python and Arduino -


An Arduino library for communicating with XBees in API mode, with support for both Series 1 (802.15.4) and Series 2 (ZB Pro/ZNet). This library includes support for the majority of packet types, including: TX/RX, AT Command, Remote AT, I/O Samples and Modem Status.
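The framing that API mode wraps around every packet is simple enough to sketch. The Python below is not the Arduino library's code; the function names are assumptions for this example. It builds an unescaped API frame (start delimiter 0x7E, two-byte big-endian length, frame data, checksum) and a local AT Command frame (frame type 0x08). Escaped API mode (AP=2) is deliberately left out.

```python
START_DELIMITER = 0x7E
AT_COMMAND = 0x08  # frame type for a local AT command


def xbee_api_frame(frame_data):
    """Wrap frame data in delimiter, length and checksum (no escaping)."""
    length = len(frame_data)
    # Checksum: 0xFF minus the low byte of the sum of the frame data bytes.
    checksum = 0xFF - (sum(frame_data) & 0xFF)
    header = bytes([START_DELIMITER, (length >> 8) & 0xFF, length & 0xFF])
    return header + frame_data + bytes([checksum])


def at_command_frame(command, frame_id=1, parameter=b""):
    """Build an AT Command frame for a two-letter command such as "NJ"."""
    frame_data = bytes([AT_COMMAND, frame_id]) + command.encode("ascii") + parameter
    return xbee_api_frame(frame_data)


def checksum_ok(frame):
    """A frame is valid if frame data plus checksum sum to 0xFF (low byte)."""
    return (sum(frame[3:]) & 0xFF) == 0xFF
```

For example, `at_command_frame("NJ", frame_id=0x52).hex()` yields `7e000408524e4a0d`. A frame ID of zero conventionally suppresses the response, so a nonzero ID is used when a reply is wanted.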


Scientists at LLNL have developed an open source, non-intrusive, and general purpose parallel-in-time code, XBraid. The algorithm enables a scalable parallel-in-time approach by applying multigrid to the time dimension. It is designed to be nonintrusive. That is, users apply their existing sequential time-stepping code according to our interface, and then XBraid does the rest. Users have spent years, sometimes decades, developing the right time-stepping scheme for their problem. XBraid allows users to keep their schemes, but enjoy parallelism in the time dimension.

Traditional sequential time-marching algorithms are a critical part of any computer simulation of a time-dependent problem, but these algorithms are currently facing a sequential bottleneck. This bottleneck is driven by the broad trend that future performance gains will come from greater concurrency, not faster clock speeds. Previously, ever-increasing clock speeds decreased the compute time for each time step, thus allowing more time steps to be calculated without increasing the overall compute time. Now that clock speeds are stagnant, further refinements in time (i.e., increases in the number of time steps) will simply increase the simulation’s overall compute time. Many of these refinements in time will be required to maintain balance between spatial and temporal accuracies. Additionally, some simulations are already fully resolved in space, and it is unclear how such simulations will take advantage of the coming increases in concurrency.

LLNL researchers have advanced an alternative solution—solving all of the time steps simultaneously, with the help of a new multilevel algorithm and the massively parallel processing capabilities of current and future high-performance computers. This approach has already shown an ability to dramatically decrease the solution time for some simulations by ten-fold or more.
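As a rough illustration of solving all time steps at once, the sketch below uses Parareal, a simpler two-level parallel-in-time method. It is not XBraid's multigrid algorithm and does not use XBraid's interface; the test problem dy/dt = lam * y, the backward Euler propagators, and all function names are assumptions chosen for illustration. The fine propagations inside each iteration are independent per time slice, which is where the parallelism in time would come from.

```python
def coarse_step(y, dt, lam):
    """Cheap propagator: one backward Euler step for dy/dt = lam * y."""
    return y / (1.0 - lam * dt)


def fine_step(y, dt, lam, substeps=20):
    """Accurate propagator: many small backward Euler steps over one slice."""
    h = dt / substeps
    for _ in range(substeps):
        y = y / (1.0 - lam * h)
    return y


def serial_fine(y0, t_end, n_slices, lam):
    """Reference: ordinary sequential time marching with the fine propagator."""
    dt = t_end / n_slices
    u = [y0]
    for n in range(n_slices):
        u.append(fine_step(u[n], dt, lam))
    return u


def parareal(y0, t_end, n_slices, n_iter, lam):
    """Parareal iteration: u[n+1] = G(new u[n]) + F(old u[n]) - G(old u[n])."""
    dt = t_end / n_slices
    # Initial guess from a sequential sweep of the cheap coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse_step(u[n], dt, lam))
    for _ in range(n_iter):
        # These per-slice evaluations are independent and could run in parallel.
        fine_vals = [fine_step(u[n], dt, lam) for n in range(n_slices)]
        coarse_old = [coarse_step(u[n], dt, lam) for n in range(n_slices)]
        # Sequential but cheap coarse correction sweep.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse_step(new[n], dt, lam) + fine_vals[n] - coarse_old[n])
        u = new
    return u


reference = serial_fine(1.0, 2.0, 8, -1.0)
for k in (0, 2, 8):
    approx = parareal(1.0, 2.0, 8, k, -1.0)
    err = max(abs(a - b) for a, b in zip(approx, reference))
    print(f"iterations={k} max error vs serial fine solve={err:.2e}")
```

With as many iterations as time slices the result reproduces the sequential fine solution up to rounding, so the method only pays off when far fewer iterations are enough.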


XIOS stands for XML-IO-SERVER, a library dedicated to I/O management of climate codes and model output.


A software package that allows the fast and easy solution of sets of ordinary, partial and stochastic differential equations, using a variety of efficient numerical algorithms. XMDS2 is a cross-platform, GPL-licensed, open source package for numerically integrating initial value problems that range from a single ordinary differential equation up to systems of coupled stochastic partial differential equations. The equations are described in a high-level XML-based script, and the package generates low-level optionally parallelised C++ code for the efficient solution of those equations. It combines the advantages of high-level simulations, namely fast and low-error development, with the speed, portability and scalability of hand-written code.


The Extensible Messaging and Presence Protocol (XMPP) is an open technology for real-time communication, which powers a wide range of applications including instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data.


XSHELLS is yet another code simulating incompressible fluids in a spherical cavity. In addition to the Navier-Stokes equation with an optional Coriolis force, it can also time-step the coupled induction equation for MHD (with imposed magnetic field or in a dynamo regime), as well as the temperature (or codensity) equation in the Boussinesq framework.

XSHELLS uses finite differences (second order) in the radial direction and spherical harmonic decomposition (pseudo-spectral). The time-stepping uses a semi-implicit Crank-Nicolson scheme for the diffusive terms, while the non-linear terms can be handled either by an Adams-Bashforth or a Predictor-Corrector scheme (both second order in time).
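The combination of an implicit Crank-Nicolson treatment of the linear (diffusive) term with an explicit second-order Adams-Bashforth treatment of the non-linear term can be sketched on a scalar model problem. This is only an illustration under stated assumptions, not XSHELLS code: the equation du/dt = lin * u + nonlinear(u), the forward Euler start-up step, and the function names are all hypothetical choices.

```python
import math


def cn_ab2_integrate(u0, lin, nonlinear, dt, n_steps):
    """Integrate du/dt = lin*u + nonlinear(u) with Crank-Nicolson / AB2.

    The linear term is averaged between old and new time levels (implicit),
    the non-linear term is extrapolated from the two latest levels (explicit).
    """
    u = u0
    n_prev = None
    for _ in range(n_steps):
        n_curr = nonlinear(u)
        if n_prev is None:
            explicit = n_curr  # first step: no history yet, use forward Euler
        else:
            explicit = 1.5 * n_curr - 0.5 * n_prev
        # (1 - dt/2 * lin) * u_new = (1 + dt/2 * lin) * u + dt * explicit
        u = ((1.0 + 0.5 * dt * lin) * u + dt * explicit) / (1.0 - 0.5 * dt * lin)
        n_prev = n_curr
    return u


def exact_solution(u0, t):
    """Closed-form solution of du/dt = -u - u**3 (a Bernoulli equation)."""
    return u0 * math.exp(-t) / math.sqrt(1.0 + u0 * u0 * (1.0 - math.exp(-2.0 * t)))


for dt in (0.1, 0.05, 0.025):
    steps = round(1.0 / dt)
    approx = cn_ab2_integrate(1.0, -1.0, lambda u: -u ** 3, dt, steps)
    print(f"dt={dt} error={abs(approx - exact_solution(1.0, 1.0)):.2e}")
```

For a second-order scheme the error should shrink by roughly a factor of four each time the step is halved.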

XSHELLS is written in C++ and designed for speed. It uses the blazingly fast spherical harmonic transform library SHTns, as well as hybrid parallelization using OpenMP and/or MPI. This allows it to run efficiently on your laptop or on parallel supercomputers. A post-processing program is provided to extract useful data and export fields to matlab/octave, python/matplotlib or paraview.


The X-Stack Program was created to support research that targets significant advances in programming models, languages, compilers, runtime systems and tools. The expected results of this program are complete solutions to the system software stack for Exascale computing platforms (X-Stack) which address fundamental challenges identified in the ASCR Exascale Programming Challenges Workshop, captured in the workshop report, as well as the ones identified in the ASCR Exascale Tools Workshop, captured in the workshop report. Solutions being researched involve radically new approaches to programming Exascale applications and algorithms. They will demonstrate the viability of such solutions in a broad high performance programming context and will enable automatic semantics- and performance-preserving transformations of applications (possibly with users in the loop).


ROSE is an open source compiler infrastructure to build source-to-source program transformation and analysis tools for large-scale C (C89 and C98), C++ (C++98 and C++11), UPC, Fortran (77/95/2003), OpenMP, Java, Python and PHP applications.


We are building a generic, extensible compiler infrastructure that can incorporate semantic information from domain-specific libraries to enable transformations that leverage domain-specific properties of library methods. Rather than building domain-specific compilers for each domain, our extensible compiler becomes a domain specific compiler for a domain when paired with domain-specific libraries.


Yael is a library implementing computationally intensive functions used in large scale image retrieval, such as neighbor search, clustering and inverted files. The library offers interfaces for C, Python and Matlab.


Yesod is a Haskell web framework for productive development of type-safe, RESTful, high performance web applications.


LambdaCms is a set of packaged libraries, containing subsites for the Yesod application framework, which allow rapid development of robust and highly performant websites with content management functionality.


Yorick is an interpreted programming language for scientific simulations or calculations, postprocessing or steering large simulation codes, interactive scientific graphics, and reading, writing, or translating large files of numbers. Yorick includes an interactive graphics package, and a binary file package capable of translating to and from the raw numeric formats of all modern computers. Yorick is written in ANSI C and runs on most operating systems.


The yt project aims to produce an integrated science environment for collaboratively asking and answering astrophysical questions. To do so, it will encompass the creation of initial conditions, the execution of simulations, and the detailed exploration and visualization of the resultant data. It will also provide a standard framework, based on physical quantities, for interoperability between codes.


Open Source ESB, SOA, REST, APIs and Cloud Integrations in Python. Build and orchestrate integration services, expose new or existing APIs, either cloud or on-premise, and use a wide range of connectors, data formats and protocols.

Zato facilitates intercommunication across applications and data sources spanning your organization’s business or technical boundaries and beyond, enabling you to access, design, develop or discover new opportunities and processes.

The protocols, standards and formats supported are HTTP, REST, JSON, SOAP, AMQP, JMS WebSphere MQ, ZeroMQ, Redis, SQL, Cassandra, Amazon S3, OpenStack Swift, Odoo/OpenERP, SMTP, IMAP, FTP, Solr, ElasticSearch, publish/subscribe, integration patterns, RBAC, and more.


ZKCM is a C++ library developed for the purpose of multiprecision matrix computation, on the basis of the GNU MP and MPFR libraries. It provides an easy-to-use syntax and convenient functions for matrix manipulations including those often used in numerical simulations in quantum physics. Its extension library, ZKCM_QC, is developed for simulating quantum computing using the time-dependent matrix-product-state simulation method.

Original Section

Notes about how to install and use cool software.

Machine Learning and Data Mining Info

Discovering and Visualizing Patterns with Python

PDF cheatsheet (7 pp.)

Weblog of cheatsheet author:

Machine Learning: An Algorithmic Perspective

A book with lots of Python examples, the code for which is available at the link shown.

Neural Network Emulations for Complex Multidimensional Geophysical Mappings

PDF review paper (34 pp.)

Predicting Solar Energy from Weather Forecasts Using Python

Using Python to read data from NetCDF files and then perform data mining.

Application of Machine Learning Methods to Spatial Interpolation of Environmental Variables

PDF paper (13 pp.)

Review of Spatial Interpolation Methods for Environmental Scientists

PDF technical report (154 pp.)

Climate Informatics

PDF review paper (46 pp.)

Comparing Predictive Power in Climate Data: Clustering Matters

PDF paper (17 pp.)

Applying Machine Learning Methods to Climate Variability

Nonlinear Multivariate and Time Series Analysis by Neural Network Methods

Pattern Recognition in Time Series

PDF paper (28 pp.)

Application of Statistical Learning to Plankton Image Analysis

Machine Learning Algorithms for Real Data Sources with Applications to Climate Science

PDF slides (46 pp.)

Machine Learning for Climate Science

Online slides (196 pp.)

Applicability of Data Mining Techniques for Climate Prediction

PDF paper (4 pp.)

Outstanding Problems at the Interface of Climate Prediction and Data Mining

Online slides (35 pp.)

Unsupervised Machine Learning Techniques for Studying Climate Variability

PDF slides (21 pp.)

Tracking Climate Models

PDF paper (15 pp.)

Streaming Data Mining

PDF slides (229 pp.)

Machine Learning for Hackers

Book (324 pp.) with examples using R.

Python and Matlab


A software framework in Fortran to build large-scale parallel applications. It
is designed for applications using three-dimensional structured mesh and
spatially implicit numerical algorithms. At the foundation it implements a
general-purpose 2D pencil decomposition for data distribution on
distributed-memory platforms. On top it provides a highly scalable and
efficient interface to perform three-dimensional distributed FFTs. The library
is optimised for supercomputers and scales well to hundreds of thousands of
cores. It relies on MPI but provides a user-friendly programming interface
that hides communication details from application developers.

See Incompact3d.
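The 2D pencil decomposition described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the library's Fortran API: in an "x-pencil" each process holds the full x extent and a block of y and z. The rank-to-grid mapping, the rule that hands remainder points to the lowest-indexed blocks, and the names `block_range` and `x_pencil` are assumptions made for this sketch.

```python
def block_range(n, parts, index):
    """Half-open [start, stop) of block `index` when n points are split
    into `parts` nearly equal blocks (extra points go to the first blocks)."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def x_pencil(global_shape, p_row, p_col, rank):
    """Local index ranges of an x-pencil for `rank` in a p_row x p_col grid.

    x stays complete on every process; y is split over p_row, z over p_col.
    """
    nx, ny, nz = global_shape
    row, col = divmod(rank, p_col)  # assumed row-major rank ordering
    return (0, nx), block_range(ny, p_row, row), block_range(nz, p_col, col)


# An 8 x 10 x 6 grid on a 2 x 3 process grid (6 ranks).
for rank in range(6):
    print(rank, x_pencil((8, 10, 6), 2, 3, rank))
```

Distributed 3D FFTs built on such a layout typically transform along the complete direction, then transpose the data to y-pencils and z-pencils to transform the remaining directions.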


A programming environment for heterogeneous architectures.


A set of DOE-developed software tools, sometimes in collaboration with other
funding agencies (DARPA, NSF), that make it easier for programmers to write
high performance scientific applications for high-end computers.


A geographical information system to visualize netCDF files via the web. The
software consists of a server side C++ application and a client side
JavaScript application. The software provides several features to access and
visualize data over the web, and it uses OGC standards for data dissemination.


The Advanced Data mining And Machine learning System (ADAMS) is a novel,
flexible workflow engine aimed at quickly building and maintaining real-world,
complex knowledge workflows.


A system of computer programs for solving time dependent, free surface
circulation and transport problems in two and three dimensions. These programs
utilize the finite element method in space allowing the use of highly
flexible, unstructured grids.


The Adaptable IO System (ADIOS) provides a simple, flexible way for scientists
to describe the data in their code that may need to be written, read, or
processed outside of the running simulation. By providing an XML file, external
to the code, that describes the various elements, their types, and how you wish
to process them in a given run, the routines in the host code (either Fortran
or C) can transparently change how they process the data.


A software library designed to help rapidly build scalable parallel programs.


An adaptive mesh refinement package written in Fortran 90.


An open-source object-oriented Finite Element library which has the ambition to
be generic and efficient. Akantu is developed within the LSMS (Computational
Solid Mechanics Laboratory), where research is conducted at the
interface of mechanics, material science, and scientific computing. The
open-source philosophy is important for the evolution of any scientific
software project. The collaboration permitted by shared code enforces sanity
when users (and not only developers) can criticize the implementation details.
Akantu was born with the vision to associate genericity, robustness and
efficiency while benefiting from open-source visibility.


Implementations of a few algorithms and data structures for fun and profit.


A software package providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on the Markov logic representation.


The Adaptive Mesh generator for Atmospheric and Ocean Simulation is a mesh generator for adaptive algorithms. It is capable of handling complex geometries as well as highly non-uniform refinement regions. It has a relatively simple programming interface and incorporates some optimization. There is even a 3D version of amatos.


The Adaptive Message Passing Interface is an implementation of MPI that supports dynamic load balancing and multithreading for MPI applications.


A machine independent parallel programming system. Programs written using this system will run unchanged on MIMD machines with or without a shared memory. It provides high-level mechanisms and strategies to facilitate the task of developing even highly complex parallel applications.


The Astrophysical Multipurpose Software Environment provides a software framework for astrophysical simulations, in which existing codes from different domains, such as stellar dynamics, stellar evolution, hydrodynamics and radiative transfer, can be easily coupled. AMUSE uses Python to interface with existing numerical codes. The AMUSE interface handles unit conversions, provides consistent object-oriented interfaces, manages the state of the underlying simulation codes and provides transparent distributed computing.


Python bindings for the Armadillo matrix library.


A C++ linear algebra library.


A collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.



A free open-source software program for solving small to very large mathematical models. ASCEND can solve systems of non-linear equations, linear and nonlinear optimisation problems, and dynamic systems expressed in the form of differential/algebraic equations.


A SEJITS implementation for Python. Asp is a research prototype and implementation of SEJITS (Selective, Embedded Just-in-Time Specialization) for Python. With the aid of application-specific specializers, it compiles fragments of Python down to low-level parallelized CPU and GPU implementations.


A Python web framework that makes the most of the filesystem.

Cacti

A complete network graphing solution designed to harness the power of
RRDTool's data storage and graphing functionality. Cacti provides a fast
poller, advanced graph templating, multiple data acquisition methods, and user
management features out of the box. All of this is wrapped in an intuitive,
easy to use interface that makes sense for LAN-sized installations up to
complex networks with hundreds of devices.


The OpenSource industry standard, high performance data logging and graphing
system for time series data. RRDtool can be easily integrated in shell
scripts, perl, python, ruby, lua or tcl applications.


Cactus is an open source problem solving environment designed for scientists
and engineers. Its modular structure easily enables parallel computation
across different architectures and collaborative code development between
different groups. Cactus originated in the academic research community, where
it was developed and used over many years by a large international
collaboration of physicists and computational scientists.

The name Cactus comes from the design of a central core ("flesh") which
connects to application modules ("thorns") through an extensible interface.
Thorns can implement custom developed scientific or engineering applications,
such as computational fluid dynamics. Other thorns from a standard
computational toolkit provide a range of computational capabilities, such as
parallel I/O, data distribution, or checkpointing.

Cactus runs on many architectures. Applications, developed on standard
workstations or laptops, can be seamlessly run on clusters or supercomputers.
Cactus provides easy access to many cutting edge software technologies being
developed in the academic research community, including the Globus
Metacomputing Toolkit, HDF5 parallel file I/O, the PETSc scientific library,
adaptive mesh refinement, web interfaces, and advanced visualization tools.


A computer algebra system (CAS) designed specifically for the solution of
problems encountered in field theory. It has extensive functionality for
tensor computer algebra, tensor polynomial simplification including multi-term
symmetries, fermions and anti-commuting variables, Clifford algebras and Fierz
transformations, implicit coordinate dependence, multiple index types and many
more. The input format is a subset of TeX. Both a command-line and a graphical
interface are available.


A framework for convolutional neural network algorithms, developed with speed in mind. Caffe aims to provide computer vision scientists and practitioners with a clean and modifiable implementation of state-of-the-art deep learning algorithms. For example, network structure is easily specified in separate config files, with no mess of hard-coded parameters in the code. At the same time, Caffe fits industry needs, with blazing fast C++/CUDA code for GPU computation.

Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (≈ 2.5 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.


The Cameleon language is a graphical data flow language following a two-scale paradigm. It allows easy up-scaling, that is, the integration of any library written in C++ into the data flow language. The Cameleon language aims to democratize macro-programming through an intuitive interaction between the human and the computer, where building an application based on a data-process and a GUI is a simple task to learn and to do. The Cameleon language allows conditional execution and repetition to solve complex macro-problems. In this paper we introduce a new model based on an extension of the Petri net model to describe how the Cameleon language executes a composition.


A set of libraries for performing various tasks in radio astronomy.


Python bindings for the casacore radio astronomy libraries.


A simple, portable, high-performance, scalable, and robust communication interface for HPC and Data Centers. Targeted towards high performance computing (HPC) environments as well as large data centers, CCI can provide a common network abstraction layer (NAL) for persistent services as well as general interprocess communication. In HPC, MPI is the de facto standard for communication within a job. Persistent services such as distributed file systems, code coupling (e.g. a simulation sending output to an analysis application sending its output to a visualization process), health monitoring, debugging, and performance monitoring, however, exist outside of scheduler jobs or span multiple jobs. In these cases, these persistent services tend to use either BSD sockets for portability to avoid having to rewrite the applications for each new interconnect or they implement their own NAL which takes developer time and effort. CCI can simplify support for these persistent services by providing a common NAL which minimizes the maintenance and support for these services while providing improved performance (i.e. reduced latency and increased bandwidth) compared to Sockets.


An asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).


Robust messaging for applications.


An advanced key-value store often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.


Cello is a GNU99 C library which brings higher level programming to C.


A package of high-resolution central schemes for nonlinear conservation laws and related problems.


A compiler infrastructure for the source-to-source transformation of software programs.


A JavaScript library for creating 3D and 2D maps in a web browser without a plugin. The features include:

  • Create data-driven time-dynamic scenes using CZML. See the CZML Guide.

  • Visualize high-resolution worldwide Terrain. See STK World Terrain.

  • Layer imagery from multiple sources, including WMS, TMS, WMTS, OpenStreetMap, Bing Maps, ArcGIS MapServer, Google Earth Enterprise, and standard image files. Each layer can be alpha-blended with the layers below it, and its brightness, contrast, gamma, hue, and saturation can be dynamically changed.

  • Draw GeoJSON and TopoJSON.

  • Draw 3D models using COLLADA and glTF with animations and skins.

  • Draw and style a wide range of geometries

  • Draw the atmosphere, sun, sun lighting, moon, stars, and water.

  • Individual object picking.

  • Camera navigation with mouse and touch handlers for rotate, zoom, pan with inertia, flights, free look, and terrain collision detection.

  • Batching, culling, and JavaScript and GPU optimizations for performance.

  • Precision handling for large view distances (avoiding z-fighting) and large world coordinates (avoiding jitter)


A library to read and write CZML files for Cesium


Implements the CF data model for the reading, writing and processing of data and its metadata.


A Python interface to UNIDATA’s Udunits-2 package with CF extensions.


Demonstrates the theory of convolution underlying engineering systems and signal analysis. Designed to enhance the learning experience, C-Graph features an attractive array of scalable pulses, periodic, and aperiodic signal types of variable frequency fundamental to the study of systems theory. The package displays the spectra of any two waveforms chosen by the user, computes their linear convolution, then compares their circular convolution according to the convolution theorem. Each signal is modelled by a register of N discrete values (samples), and the discrete Fourier Transform (DFT) computed by the Fast Fourier Transform (FFT). Students of signal and systems theory will find GNU C-Graph to be of value in visualizing convolution.
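The relationship between linear and circular convolution that the package visualizes can be checked numerically. The following is a minimal sketch, not C-Graph code: it uses a naive O(N^2) DFT from the Python standard library in place of an FFT, and the helper names are hypothetical. By the convolution theorem, multiplying two N-point DFTs and inverting gives the circular convolution; zero-padding both signals to at least len(x) + len(h) - 1 samples makes it coincide with the linear convolution.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); stands in for an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolution(x, h):
    """Circular convolution of two equal-length real signals via the DFT."""
    if len(x) != len(h):
        raise ValueError("signals must have the same length")
    product = [a * b for a, b in zip(dft(x), dft(h))]
    return [round(v.real, 9) for v in dft(product, inverse=True)]


def linear_convolution(x, h):
    """Direct linear convolution; the result has len(x) + len(h) - 1 samples."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            y[i + j] += a * b
    return y


def pad(x, n):
    """Zero-pad a signal to n samples."""
    return list(x) + [0.0] * (n - len(x))


x, h = [1.0, 2.0, 3.0], [1.0, 1.0]
print(linear_convolution(x, h))                   # [1.0, 3.0, 5.0, 3.0]
print(circular_convolution(x, pad(h, 3)))         # wraps around: differs from linear
print(circular_convolution(pad(x, 4), pad(h, 4))) # padded: matches linear
```

In the unpadded 3-sample case the tail of the linear result wraps around and adds to the first sample, which is exactly the difference between the two kinds of convolution.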


A versatile genetic programming application which includes a command-line client and an interactive console mode. It features built in input-output mapping support, and is user-extensible for complex fitness evaluation in Python and Lisp.


Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora’s capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.

Cilk Plus

Adds simple language extensions to the C and C++ languages to express task and data parallelism. These language extensions are powerful, yet easy to apply and use in a wide range of applications.

This was an MIT research program that got folded into the commercially available Intel C++ Compiler Suite. There is a branch of the GCC compiler development stack that’s also in the process of including Cilk.


The main objectives of the METAFOR project were to develop and promulgate an ipso-facto standard for describing climate models and associated data. This standard has been formalized and named the Common Information Model (CIM). Adoption of the CIM standard will allow the climate science community to nurture an eco-system of CIM compliant tools and services to be integrated into the day to day activities of climate research institutes worldwide. The CIM is an ontology, i.e. an informational model describing a particular domain (i.e. climate science). Such a model is formed using a construct known as a class (e.g. simulation). Classes form relationships with other classes (e.g. a simulation has data). Related classes are grouped into packages. The CIM is formally defined using the Unified Modelling Language.


The Coupled-Layer Architecture for Robotic Autonomy (CLARAty) is a framework that promotes reusable robotic software. It was designed to support heterogeneous robotic platforms and integrate advanced robotic capabilities from multiple institutions. Consequently, its design had to be portable, modular, flexible and extendable.


A lightweight Clifford algebra template library.


A Python-based software component toolkit providing a flexible problem-solving environment for climate science problems. CliMT consists of two layers: a library of climate modeling components (radiative and convective schemes, dynamical cores etc.), mostly in Fortran; and a Python superstructure providing standardized access to each component and allowing coupling of components to form time-dependent models.


Robustly detects extremes against a time-dependent background in climate and weather time series.


A freely available software tool for 3D visualizations and scientific calculations that was conceived and written by Dr. Christian Perwass. CLUCalc interprets a script language called ‘CLUScript’, which has been designed to make mathematical calculations and visualisations very intuitive.


An implementation of the constrained natural element method in 2D and 3D. It is written in C++ and has Python and Matlab wrappers.

Coarray Fortran

A SPMD parallel programming model based on a small set of language extensions to Fortran 90. CAF supports access to non-local data using a natural extension to Fortran 90 syntax, lightweight and flexible synchronization primitives, pointers, and dynamic allocation of shared data. An executing CAF program consists of a static collection of asynchronous process images.

Rice’s implementation of Coarray Fortran 2.0 is a work in progress. We are working to create an open-source, portable, retargetable, high-quality CAF 2.0 compiler suitable for use with production codes. To achieve portability, our compiler performs a source-to-source translation from CAF to Fortran 90 with calls to our CAF 2.0 runtime library primitives. Our CAF compiler’s generated code can be compiled by any Fortran 90 compiler that supports Cray pointers. To achieve high performance, we generate Fortran 90 that is readily optimizable by vendor compilers. Our CAF 2.0 runtime library uses UC Berkeley’s GASNet library as a substrate for communication. GASNet’s get and put operations are used to read and write remote coarray elements. GASNet’s active message support is used to invoke operations on remote nodes. This capability is used to form teams and to look up information about remote coarrays so that process images can read and write them directly.


The Common Data Access toolbox (CODA) provides a set of interfaces for reading remote sensing data from earth observation data files. These interfaces consist of command line applications, libraries, interfaces to scientific applications (such as IDL and MATLAB), and interfaces to programming languages (such as C, Fortran, Python, and Java).

CODA provides a single interface to access data in a wide variety of data formats, including ASCII, binary, XML, netCDF, HDF4, HDF5, GRIB, RINEX, and SP3. This is done by using a generic high level type hierarchy mapping for each data format. For self describing formats such as netCDF, HDF, and GRIB, CODA will automatically construct this mapping based on the file itself. For raw ASCII and binary (and partially also XML) formats CODA makes use of an external format definition stored in .codadef files to determine this mapping. On the download section of this website you will find .codadef files for various earth observation missions that can be used with CODA.


The COllaborative DEvelopment SHell project provides an automatic persistent logbook for sessions of personal command-line work by recording what is being done and how, both for private use/reuse and for sharing selected parts with collaborators.


The primary interface for managing Anaconda installations. It can query and search the Anaconda package index and current Anaconda installation, create new Anaconda environments, and install and update packages into existing Anaconda environments.


A scientific tool for the numerical integration of dynamical systems whose mutual couplings are described by a network. Its name is an abbreviation of “Complex Networks Dynamics”.

Conedy supports different dynamical systems with various integration schemes, including ordinary differential equations, iterated maps, stochastic differential equations, and pulse coupled oscillators which are handled via events. In addition, it provides a simple way to handle arbitrary node dynamics. Each dynamical system is associated with a node in a network and edges between such nodes represent couplings. Conedy provides functions to build a network from various node and edge types.
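The idea of attaching a dynamical system to each node and a coupling to each edge can be sketched without any library. The example below is not Conedy's interface: it assumes scalar node states, diffusive coupling dx_i/dt = coupling * sum over neighbours j of (x_j - x_i), undirected edges, and explicit Euler integration, with hypothetical function names.

```python
def euler_step(state, edges, coupling, dt):
    """One explicit Euler step of diffusively coupled scalar nodes."""
    deriv = [0.0] * len(state)
    for i, j in edges:  # each undirected edge couples both endpoints
        diff = state[j] - state[i]
        deriv[i] += coupling * diff
        deriv[j] -= coupling * diff
    return [x + dt * d for x, d in zip(state, deriv)]


def integrate(state, edges, coupling, dt, n_steps):
    """Advance the whole network by n_steps Euler steps."""
    for _ in range(n_steps):
        state = euler_step(state, edges, coupling, dt)
    return state


# Five nodes on a ring: node i is coupled to node (i + 1) mod 5.
ring = [(i, (i + 1) % 5) for i in range(5)]
final = integrate([0.0, 1.0, 2.0, 3.0, 4.0], ring, coupling=1.0, dt=0.05, n_steps=400)
print([round(x, 6) for x in final])
```

Because every edge adds to one node exactly what it removes from the other, the mean of the states is conserved and the nodes relax towards it.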

Connectivity Modeling System

A community multiscale modeling system, based on a stochastic Lagrangian framework. It was developed to study complex larval migrations and give probability estimates of population connectivity. In addition, the CMS can also provide Lagrangian descriptions of oceanic phenomena (advection, dispersion, retention) and can be used in a broad range of applications, from the dispersion and fate of pollutants to marine spatial conservation.
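A stochastic Lagrangian approach of this kind can be illustrated with a one-dimensional random-walk particle model. This is a generic textbook sketch, not the CMS implementation: it assumes a single spatial dimension, a constant diffusivity, a forward Euler update x += u(x) * dt + sqrt(2 * K * dt) * N(0, 1), and hypothetical function names. A connectivity probability is then estimated as the fraction of released particles that end up inside a target interval.

```python
import math
import random


def advect_diffuse(positions, velocity, diffusivity, dt, n_steps, rng):
    """Move particles by advection plus a Gaussian random-walk displacement."""
    sigma = math.sqrt(2.0 * diffusivity * dt)
    pts = list(positions)
    for _ in range(n_steps):
        pts = [x + velocity(x) * dt + sigma * rng.gauss(0.0, 1.0) for x in pts]
    return pts


def fraction_arriving(positions, lo, hi):
    """Share of particles that ended inside the interval [lo, hi]."""
    return sum(lo <= x <= hi for x in positions) / len(positions)


rng = random.Random(42)  # fixed seed so the run is repeatable
released = [0.0] * 2000
final = advect_diffuse(released, lambda x: 0.5, 0.1, 0.1, 100, rng)
print("mean position:", sum(final) / len(final))
print("estimated arrival probability in [4, 6]:", fraction_arriving(final, 4.0, 6.0))
```

Setting the diffusivity to zero reduces the model to pure advection, which is a convenient sanity check.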


ConTeXt can be used to typeset complex and large collections of documents, like educational materials, user guides and technical manuals. Such documents often have high demands regarding structure, design and accessibility. Ease of maintenance, reuse of content and typographic consistency are important prerequisites. ConTeXt is developed for those who are responsible for producing such documents. ConTeXt is written in the typographical programming language TeX. For using ConTeXt, no TeX programming skills and no technical background are needed. Some basic knowledge of typography and document design will enable you to use the full power of ConTeXt.


A collection of open-source optimization-related Python packages that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.


A data parallel subset of Python which can be dynamically compiled and executed on parallel platforms. Currently, we target NVIDIA GPUs, as well as multicore CPUs through OpenMP and Threading Building Blocks (TBB).


A generic web service and offline processing tool developed within the Centre for Environmental Data Archival (CEDA). The CEDA OGC Web Services (COWS) is a Python software framework: a set of libraries that allows rapid development and deployment of geospatial web applications and services built around the standards managed by the Open Geospatial Consortium [OGC]. COWS emphasises rapid service development by providing a lightweight layer of OGC web service logic on top of Pylons [Pylons], a mature web application framework for the Python language. This approach provides developers with a flexible web service development environment without compromising access to the full range of web application tools and patterns: Model-View-Controller paradigm, XML templating, Object-Relational-Mapper integration and authentication/authorisation. COWS contains pre-configured implementations of WMS, WCS and WFS services, a web client and WPS.


A set of libraries providing a comprehensive, efficient and robust software toolkit for creating automated astronomical data-reduction tasks.

python-cpl

Python interface to CPL.

CPython Compiler Tools

Various compiler tools for Python.


The Community Surface Dynamics Modeling System (CSDMS) deals with the Earth’s surface - the ever-changing, dynamic interface between lithosphere, hydrosphere, cryosphere, and atmosphere. We are a diverse community of experts promoting the modeling of earth surface processes by developing, supporting, and disseminating integrated software modules that predict the movement of fluids, and the flux (production, erosion, transport, and deposition) of sediment and solutes in landscapes and their sedimentary basins.


A library for multidimensional numerical integration. The Cuba library offers a choice of four independent routines for multidimensional numerical integration: Vegas, Suave, Divonne, and Cuhre. All four have a C/C++, Fortran, and Mathematica interface and can integrate vector integrands. Their invocation is very similar, so it is easy to substitute one method by another for cross-checking. For further safeguarding, the output is supplemented by a chi-square probability which quantifies the reliability of the error estimate.
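The cross-checking idea can be illustrated with a small sketch. It is not Cuba's API and implements none of its four routines: it integrates a function over the unit hypercube with plain Monte Carlo sampling (returning an estimate and its standard error) and with a deterministic midpoint rule, so the two results can be compared. All function names, the integrand and the sample counts are assumptions chosen for illustration.

```python
import itertools
import math
import random


def monte_carlo(f, dim, n_samples, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^dim.

    Returns (estimate, standard_error).
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        value = f([rng.random() for _ in range(dim)])
        total += value
        total_sq += value * value
    mean = total / n_samples
    # Population variance of the sampled values (clamped against round-off).
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_samples)


def midpoint_rule(f, dim, points_per_axis):
    """Tensor-product midpoint rule over [0, 1]^dim."""
    h = 1.0 / points_per_axis
    nodes = [(i + 0.5) * h for i in range(points_per_axis)]
    total = sum(f(list(x)) for x in itertools.product(nodes, repeat=dim))
    return total * h ** dim


def integrand(x):
    # Illustrative integrand; its exact integral over the unit square is 1/4.
    return x[0] * x[1]


mc_value, mc_error = monte_carlo(integrand, 2, 20000, random.Random(0))
grid_value = midpoint_rule(integrand, 2, 50)
print(mc_value, mc_error, grid_value)
```

If the two estimates differ by much more than the reported Monte Carlo error, at least one of them deserves a closer look, which is the purpose of running independent methods side by side.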


Light-weight Python framework and OLAP HTTP server for easy development of reporting applications and aggregate browsing of multi-dimensionally modeled data.


CUDA™ is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).


A fast C++/CUDA implementation of convolutional (or more generally, feed-forward) neural networks. It can model arbitrary layer connectivity and network depth. Any directed acyclic graph of layers will do. Training is done using the back-propagation algorithm.


A suite engine and meta-scheduler that specializes in suites of cycling tasks for weather and climate forecasting and related processing (it can also be used for one-off workflows of non-cycling tasks, which is a simpler problem).


A JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.


A software package for numerical simulation of river hydraulics (2D / 1D). It is designed especially for parameter identification, calibration and variational data assimilation. It is interfaced with a few pre- and post-processors.


A lightweight data management application developed in Python that primarily targets the management of huge data accumulations, often encountered in the scientific field. The system is able to handle large amounts of data and can be easily integrated in existing working environments. It can be optimised to fit any situation by embedding scripts.


A robust real-time streaming data engine that lets you quickly stream live data from experiments, labs, web cams and even Java-enabled cell phones. It acts as a "black box" to which applications and devices send and receive data. Think of it as express delivery for your data, be it numbers, video, sound or text. DataTurbine is a buffered middleware, not simply a publish/subscribe system. It can receive data from various sources (experiments, web cams, etc.) and send data to various sinks (visualization interfaces, analysis tools, databases, etc.). It has "TiVo"-like functionality that lets applications pause and rewind live streaming data.


A crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data.


A novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP.


A pseudospectral solver for fluid equations. Its primary applications are in astrophysics and cosmology. Written primarily in Python, and making use of the FFTW libraries, Dedalus aims to be a simple, fast, and elegant hydrodynamic and magnetohydrodynamic code.


An open source software for spatial data infrastructures and the geospatial web. deegree includes components for geospatial data management, including data access, visualization, discovery and security. Open standards are at the heart of deegree. The software is built on the standards of the Open Geospatial Consortium (OGC) and the ISO Technical Committee 211. It includes the OGC Web Map Service (WMS) reference implementation, a fully compliant Web Feature Service (WFS) as well as packages for Catalogue Service (CSW), Web Coverage Service (WCS), Web Processing Service (WPS) and Web Map Tile Service (WMTS).


A modeling suite to investigate hydrodynamics, sediment transport and morphology, and water quality for fluvial, estuarine and coastal environments. The FLOW module is the heart of Delft3D and is a multi-dimensional (2D or 3D) hydrodynamic (and transport) simulation programme which calculates non-steady flow and transport phenomena resulting from tidal and meteorological forcing on a curvilinear, boundary-fitted grid or spherical coordinates. In 3D simulations, the vertical grid is defined following the so-called sigma coordinate approach or Z-layer approach. The MOR module computes sediment transport (both suspended and bed total load) and morphological changes for an arbitrary number of cohesive and non-cohesive fractions. Both currents and waves act as driving forces and a wide variety of transport formulae have been incorporated. For the suspended load this module connects to the 2D or 3D advection-diffusion solver of the FLOW module; density effects may be taken into account. An essential feature of the MOR module is the dynamic feedback with the FLOW and WAVE modules, which allows the flows and waves to adjust themselves to the local bathymetry and allows for simulations on any time scale from days (storm impact) to centuries (system dynamics). It can keep track of the bed composition to build up a stratigraphic record. The MOR module may be extended to include extensive features to simulate dredging and dumping scenarios.


Dexy was created out of a desire to unify software documentation and scientific document automation, resulting in a tool that is better at both of these than anything that has gone before.


A lightweight job execution control framework for parallel scientific applications. DIANE improves the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery. DIANE provides an environment in which existing applications may be more easily ported to heterogeneous computing environments such as the Grid, batch farms or interactive clusters. The default scheduling plugin algorithms are suited for bag-of-tasks applications and data-parallel problems with no inter-task communication. However, the framework is designed to make it easy to plug in other scheduling algorithms for more complex task synchronization patterns and workflows: for example, the DAG4DIANE plugin provides support for directed acyclic graph (DAG) applications and the MOTEUR plugin provides support for workflow applications.


An EOF-based method to fill in missing data from geophysical fields, such as clouds in sea surface temperature.


A C++ library for computing persistent homology.


A lightweight, open-source framework for distributed computing based on the MapReduce paradigm.
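As a reminder of what the MapReduce paradigm looks like, here is a minimal single-process sketch in plain Python. It does not use this framework's API and distributes nothing; the function names (`map_phase`, `shuffle`, `reduce_phase`) and the word-count job are illustrative assumptions. A real framework runs the same three phases across many machines.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as happens between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    # Sum the ones emitted for this word.
    return sum(counts)


lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)
print(counts["the"], counts["fox"])  # prints: 3 2
```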


A version of NumPy that parallelizes array operations in a manner completely transparent to the user.


The Distributed and Unified Numerics Environment is a modular toolbox for solving partial differential equations (PDEs) with grid-based methods. It supports the easy implementation of methods like Finite Elements (FE), Finite Volumes (FV), and also Finite Differences (FD).
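To make "grid-based methods" concrete, the following sketch applies the simplest of them, finite differences, to a one-dimensional Poisson problem in plain Python. It is not this toolbox's interface (which is not shown here); the function name `solve_poisson_1d`, the model problem and the grid size are assumptions for illustration only.

```python
def solve_poisson_1d(n_intervals, source):
    """Solve -u''(x) = source(x) on (0, 1) with u(0) = u(1) = 0.

    Uses second-order central differences on a uniform grid and the
    Thomas algorithm for the resulting tridiagonal system.
    Requires n_intervals >= 2. Returns the values at all grid nodes.
    """
    h = 1.0 / n_intervals
    n = n_intervals - 1  # number of interior nodes
    # Discrete equations: -u[i-1] + 2*u[i] - u[i+1] = h^2 * source(x_i)
    diag = [2.0] * n
    rhs = [h * h * source((i + 1) * h) for i in range(n)]
    # Forward elimination (both off-diagonals are -1).
    for i in range(1, n):
        diag[i] = 2.0 - 1.0 / diag[i - 1]
        rhs[i] += rhs[i - 1] / diag[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return [0.0] + u + [0.0]


# Constant source term; the exact solution is u(x) = x * (1 - x) / 2.
solution = solve_poisson_1d(10, lambda x: 1.0)
print(solution[5])
```

Finite element and finite volume methods lead to linear systems of a similar sparse structure; what changes is how the equations for each grid cell or node are assembled.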


A project to develop a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model.


An open-architecture, open-source public software system for data acquisition, processing, archival and distribution. Originally developed by the United States Geological Survey, Earthworm binaries and source files are freely available to everyone.


Python wrapper for accessing an Earthworm shared memory ring.


A visual analytics tool for exploring multivariate data sets. EDEN helps you see the associations among variables for guided analysis. EDEN harnesses the parallel coordinates visualization technique and is augmented with graphical indicators of key descriptive statistics.


A C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.


Elixir is a functional, meta-programming aware language built on top of the Erlang VM. It is a dynamic language with flexible syntax and macro support that leverages Erlang’s abilities to build concurrent, distributed and fault-tolerant applications with hot code upgrades.

Ellipsoidal Potential Theory

Open-source (BSD) implementations of ellipsoidal harmonic expansions for solving problems of potential theory using separation of variables.


An open source multiphysical simulation software mainly developed by CSC - IT Center for Science (CSC). Elmer development was started in 1995 in collaboration with Finnish universities, research institutes and industry. After its open source publication in 2005, the use and development of Elmer has become international.

Elmer includes physical models of fluid dynamics, structural mechanics, electromagnetics, heat transfer and acoustics, for example. These are described by partial differential equations which Elmer solves by the Finite Element Method (FEM).


An extensible, pure-Python implementation of Goodman & Weare’s Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler. It’s designed for Bayesian parameter estimation.
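The core of Goodman & Weare's sampler is the "stretch move", in which each walker is moved along the line joining it to another randomly chosen walker. The sketch below is a minimal standard-library version of that move, not this package's code or API; the function name `stretch_move_sampler`, the scale parameter default `a=2.0` and the Gaussian test target are assumptions for illustration.

```python
import math
import random


def stretch_move_sampler(log_prob, walkers, n_steps, a=2.0, rng=None):
    """Affine-invariant ensemble sampling with serial stretch moves.

    log_prob maps a position (list of floats) to a log-density.
    walkers is a list of at least two starting positions.
    Returns the ensemble positions after every step.
    """
    rng = rng or random.Random()
    walkers = [list(w) for w in walkers]
    n_walkers = len(walkers)
    n_dim = len(walkers[0])
    log_p = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(n_walkers):
            # Pick a different walker j uniformly from the rest of the ensemble.
            j = rng.randrange(n_walkers - 1)
            if j >= k:
                j += 1
            # Draw z on [1/a, a] with density proportional to 1/sqrt(z).
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = [xj + z * (xk - xj)
                        for xk, xj in zip(walkers[k], walkers[j])]
            new_log_p = log_prob(proposal)
            # Acceptance probability: min(1, z^(n_dim - 1) * p(new) / p(old)).
            log_accept = (n_dim - 1) * math.log(z) + new_log_p - log_p[k]
            if math.log(1.0 - rng.random()) < log_accept:
                walkers[k] = proposal
                log_p[k] = new_log_p
        chain.append([list(w) for w in walkers])
    return chain


def log_gaussian(x):
    # Standard normal log-density up to an additive constant.
    return -0.5 * sum(xi * xi for xi in x)


rng = random.Random(42)
start = [[rng.gauss(0.0, 1.0)] for _ in range(20)]
chain = stretch_move_sampler(log_gaussian, start, 1500, rng=rng)
samples = [w[0] for step in chain[500:] for w in step]  # drop burn-in
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)
```

For Bayesian parameter estimation, `log_prob` would be the log-posterior (log-prior plus log-likelihood) of the model parameters.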


Empirical Mode Decomposition is an algorithm that finds common rotational modes among all the channels of n-channel data, and is a generic multidimensional extension of the standard EMD.


This open source digitizing software converts an image file showing a graph or map into numbers. The image file can come from a scanner, digital camera or screenshot. The numbers can be read on the screen, and written or copied to a spreadsheet.

The process starts with an image file containing a graph or map. The final result is digitized data that can be used by other tools such as Microsoft Excel and Gnumeric.


The EnKF is a sophisticated sequential data assimilation method. It applies an ensemble of model states to represent the error statistics of the model estimate, applies ensemble integrations to predict the error statistics forward in time, and uses an analysis scheme that operates directly on the ensemble of model states when observations are assimilated. The EnKF has proven to handle strongly nonlinear dynamics and large state spaces efficiently and is now used in realistic applications with primitive equation models for the ocean and atmosphere.
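The analysis scheme can be sketched for the simplest possible case: a scalar state that is observed directly. This is a minimal illustration of the perturbed-observation form of the EnKF update, not code from any particular implementation; the function name `enkf_analysis`, its arguments and the numbers in the usage lines are assumptions.

```python
import random
import statistics


def enkf_analysis(ensemble, observation, obs_variance,
                  perturbations=None, rng=None):
    """Perturbed-observation EnKF analysis for a directly observed scalar.

    ensemble      -- forecast ensemble members (at least two floats)
    observation   -- the measured value
    obs_variance  -- observation error variance
    perturbations -- optional per-member observation noise; drawn from
                     N(0, obs_variance) when not supplied
    """
    if perturbations is None:
        rng = rng or random.Random()
        sigma = obs_variance ** 0.5
        perturbations = [rng.gauss(0.0, sigma) for _ in ensemble]
    # The forecast error variance is estimated from the ensemble itself.
    forecast_variance = statistics.variance(ensemble)
    # Kalman gain for a scalar state with an identity observation operator.
    gain = forecast_variance / (forecast_variance + obs_variance)
    # Each member is pulled toward its own perturbed copy of the observation.
    return [x + gain * (observation + eps - x)
            for x, eps in zip(ensemble, perturbations)]


rng = random.Random(1)
forecast = [rng.gauss(2.0, 1.0) for _ in range(100)]
analysis = enkf_analysis(forecast, observation=4.0, obs_variance=0.25, rng=rng)
print(statistics.mean(forecast), statistics.mean(analysis))
```

In a full system the state is a large vector, the gain is built from ensemble covariances between state and observed quantities, and the forecast ensemble comes from integrating the model forward between observation times.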

Enthought Tool Suite

A suite of Python tools for constructing custom scientific applications.