Software Repositories

List of Awesome Lists -

An awesome list is a GitHub entity that lists interesting links for a given software topic. These are metalists of such lists.

52 North Github -

Open source geospatial software.

ACM Queue -

Interesting occasional essays about software topics from clever ACM folks.

Clever Algorithms: Nature-Inspired Programming Recipes -

Example Ruby programs from a book about nature-inspired programming algorithms.

A repository of information about cluster computing.

A very large collection of clever command-line examples.

CPUShack Museum -

A CPU history museum.

Interrelationships between natural processes, computational systems and procedural-based art practices.

DevFreeBooks -

A huge collection of free computer programming books for developers.

Experimental Mathematics -

A repository of information on experimental and computer-assisted mathematics.

Fedora-Related Software Repositories

An open venue for discussing all aspects of the Fortran programming language and scientific computing.

A list of free programming books in many popular and obscure languages.

Hackery, Math and Design -

Interesting art and musings about computer-related topics.

Hacking for Artists -

A list of software packages useful to artists.


A list of software packages related to the IoT next big thing.

Libre Graphics World -

A blog about computer graphics.

Linux Virtualization Wiki -

A wiki dedicated to documenting the different virtualization technologies available in Linux, including an overview of the way each virtualization technology works, how to get started, where to get involved with development, etc.

A gateway to modern mathematics.

NASA History Series Publications -

Titles published in the NASA History Series of publications.

National Science Digital Library -

Provides high quality online educational resources for teaching and learning, with current emphasis on the sciences, technology, engineering, and mathematics (STEM) disciplines—both formal and informal, institutional and individual, in local, state, national, and international educational settings.

Nature of Code (The) -

A free online book: "We want to take a look at something that naturally occurs in our physical world, then determine how we can write code to simulate that occurrence."

New General Catalog of Old Books and Authors -

The aim of this site is to catalog all deceased authors, all authors of books published before 1964, and at least some more recent authors, including their full name(s), date of death, date of birth, pseudonyms, sex & nationality (for those who died from 1920 onwards), and their books published before 1964.

Practical Common Lisp -

Pythonic Perambulations -

txt2re Regular Expression Generator -

Virt Tools Blog Planet -



Over 800 math and science related programs written in Fortran 90.

New Section



3DEX is a Fortran/CXX package providing programs and functions to perform fast Fourier-Bessel decomposition of 3D fields. It can be applied to cosmological data or 3D data in spherical coordinates in other scientific fields. We present an equivalent formulation of the spherical Fourier-Bessel decomposition that separates radial and tangential calculations. We propose the use of the existing pixelisation scheme HEALPix for a rapid calculation of the tangential modes. 3DEX (3D EXpansions) is a public code for fast spherical Fourier-Bessel decomposition of 3D all-sky surveys that takes advantage of HEALPix for the calculation of tangential modes. 3DEX can also be used in other disciplines, where 3D data are to be analysed in spherical coordinates.


An approximate compiler for C and C++ programs based on Clang. Think of it as your assistant in breaking your program in small ways to trade off correctness for performance.


AMD Core Math Library, or ACML, provides a free set of thoroughly optimized and threaded math routines for HPC, scientific, engineering and related compute-intensive applications. ACML is ideal for weather modeling, computational fluid dynamics, financial analysis, oil and gas applications and more. ACML consists of the following main components:

  • A full implementation of Level 1, 2 and 3 Basic Linear Algebra Subroutines (BLAS), with key routines optimized for high performance on AMD Opteron™ processors. The BLAS level 3 routines will take advantage of heterogeneous computing through OpenCL if detected.

  • A full suite of LAPACK routines. As well as taking advantage of the highly-tuned BLAS kernels, a key set of LAPACK routines has been further optimized to achieve considerably higher performance than standard LAPACK implementations.

  • Beginning with version 6 of ACML, a subset of FFTW interfaces is supported for Fourier transform functionality. Heterogeneous compute with GPU/APU and OpenCL is supported through the FFTW interfaces. A comprehensive set of FFTs through the ACML-specific API (found in version 5 and older) continues to be available in version 6.

  • Random Number Generators in both single- and double-precision.

Active Papers

ActivePapers is a research and development project whose aim is to make computational science more open and more reliable by making computations reproducible and publishable. It is a file format for storing computations.

An ActivePaper is a file combining datasets and programs working on these datasets in a single package, which also contains a detailed history of which data was produced when, by running which code, and on which machine. It is a complete record of the state of a computational research project that can be shared among collaborators and in the end published as supplementary material to a journal article.


An approximate MAP decoder with Alternating Direction Dual Decomposition. AD3 (Alternating Directions Dual Decomposition) is an LP-MAP decoder for undirected constrained factor graphs. In other words, it is an approximate MAP decoder that retrieves the solution of an LP relaxation of the original problem.

The input is a factor graph, which may contain both soft factors, associated with log-potentials, and hard constraint factors, associated with a logic function. Factors can be dense, sparse, or combinatorial. Specialized factors can be implemented by the practitioner.

The output is the LP-MAP assignment, with a posterior value for each variable. If all variables are integer, the relaxation is tight and the solution is the true MAP. Otherwise, some entries can be in the unit interval. External tools can be used to obtain a valid solution using rounding heuristics. Optionally, a flag can be set that applies a branch-and-bound procedure and retrieves the true MAP (but it can be slow if the relaxation has many fractional components).


Adaptive Hydraulics (AdH) is a modern, multi-dimensional modeling system for saturated and unsaturated groundwater, overland flow, three-dimensional Navier-Stokes flow, and two- or three-dimensional shallow water problems. Developed by the Coastal and Hydraulics Laboratory at the Engineer Research and Development Center in Vicksburg, MS, the 2-dimensional (2D) shallow water module of AdH was released to the public in September 2007.


This site outlines the Aquatic Ecodynamics (AED) modelling library - an open-source community-driven library of model components for simulation of "aquatic ecodynamics" - water quality, habitat and aquatic ecosystem dynamics.

The AED library consists of numerous modules that are designed as individual model ‘components’ able to be configured in a way that facilitates custom aquatic ecosystem conceptualisations – either simple or complex. Users select the water quality and ecosystem variables they wish to simulate and are then able to customize connections and dependencies with other modules, including support for easy customisation, at an algorithm level, of how model components operate (e.g. photosynthesis functions, sorption algorithms etc). In general, model components consider the cycling of carbon, nitrogen and phosphorus, and other relevant components such as oxygen, and are able to simulate organisms including different functional groups of phytoplankton and zooplankton, and also organic matter. Modules to support simulation of water column and sediment geochemistry, including coupled kinetic-equilibria, are also included.


The Stochastic Simulation Algorithm (SSA) developed by Gillespie provides a powerful mechanism for exploring the behavior of chemical systems with small species populations or with important noise contributions. Gene circuit simulations for systems biology commonly employ the SSA method, as do ecological applications. This algorithm tends to be computationally expensive, so researchers seek an efficient implementation of SSA. In this program package, the Accelerated Exact Stochastic Simulation Algorithm (AESS) contains optimized implementations of Gillespieʼs SSA that improve the performance of individual simulation runs or ensembles of simulations used for sweeping parameters or to provide statistically significant results.
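As a rough illustration of what the SSA does, here is a minimal Python sketch of Gillespie's direct method. It is not taken from the AESS package and makes no attempt at the optimizations described above; the function names, the argument layout, and the single-reaction decay system are all assumptions chosen for the example.

```python
import math
import random


def gillespie_direct(x0, stoich, propensity, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    x0         -- initial species counts (list of ints)
    stoich     -- one state-change vector per reaction
    propensity -- function mapping a state to a list of reaction rates
    t_end      -- the simulation stops once time would exceed this value
    rng        -- a random.Random instance
    """
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        rates = propensity(x)
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        # The waiting time to the next reaction is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Pick reaction j with probability rates[j] / total.
        threshold = rng.random() * total
        cumulative = 0.0
        for j, rate in enumerate(rates):
            cumulative += rate
            if threshold < cumulative:
                break
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory


# Hypothetical example system: irreversible decay A -> (nothing), rate constant 0.5.
def decay_propensity(state):
    return [0.5 * state[0]]


path = gillespie_direct([100], [[-1]], decay_propensity, 50.0, random.Random(1))
```

Each entry of `path` is a `(time, state)` pair. An ensemble of runs, as used for parameter sweeps, would call the function repeatedly with differently seeded generators.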


An open source, science-as-a-service API platform for powering your digital lab. Agave allows you to bring together your public, private, and shared high performance computing (HPC), high throughput computing (HTC), Cloud, and Big Data resources under a single, web-friendly REST API. It enables you to run scientific codes on HPC, HTC or cloud resources, manage your data from a web interface, and remember how you did it.


A simple Python binding for the Agave API.


Akaros is an open source, GPL-licensed operating system for manycore architectures. Our goal is to provide support for parallel and high-performance applications and to scale to a large number of cores.


Albany is an implicit, unstructured grid, finite element code for the solution and analysis of partial differential equations. Albany is the main demonstration application of the AgileComponents software development strategy at Sandia. It is a PDE code that strives to be built almost entirely from functionality contained within reusable libraries (such as Trilinos/STK/Dakota/PUMI). Albany plays a large role in demonstrating and maturing functionality of new libraries, and also in the interfaces and interoperability between these libraries. It also serves to expose gaps in our coverage of capabilities and interface design.

The highlight of Albany is the PDE assembly. The template-based generic programming approach allows developers to just program for residual equations, and all manner of derivatives and polynomial propagations get automatically computed with no development effort. This approach uses Phalanx for rapid and flexible addition of physics, which works closely with Sacado and Stokhos for automatic propagation of derivatives and UQ. The Trilinos Intrepid and Shards packages are used for the local discretization. A second strength of Albany is the demonstration of transformational analysis algorithms. Albany demonstrates the direct use of all Solver/Analysis tools in Trilinos (through Piro, which was developed in Albany) including NOX, LOCA, Rythmos, Stokhos, and all of Dakota. On any problem we not only get a solution, but can also get sensitivities, run optimization problems, and perform uncertainty quantification. All of these approaches can access all of the linear solver options in Trilinos that are exposed by the Stratimikos layer. The third main strength is the early adoption of STK, the sierra toolkit libraries. This includes the mesh database, IO, and mesh adaptation capabilities from stk_rebalance and stk_adapt.


The Arcade Learning Environment (ALE) is a simple object-oriented framework that allows researchers and hobbyists to develop AI agents for Atari 2600 games. It is built on top of the Atari 2600 emulator Stella and separates the details of emulation from agent design.


Amahi is software that runs on a dedicated PC as a central computer for your home. It handles your entertainment, storage, and computing needs. You can store, organize and deliver your recorded TV shows, videos and music to media devices in your network. Share them locally or safely around the world. And it’s expandable with a multitude of one-click install apps.


AMD LibM is a software library containing a collection of basic math functions optimized for x86-64 processor based machines. It provides many routines from the list of standard C99 math functions. AMD LibM is a C library, which users can link in to their applications to replace compiler-provided math functions. Generally, programmers access basic math functions through their compiler. But those who want better accuracy or performance than their compiler’s math functions can use this library to help improve their applications. Users can also take advantage of the vector functions in this library. The vector variants can be used to speed up loops and perform math operations on multiple elements conveniently.


AMG2013 is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. It has been derived directly from the BoomerAMG solver in the hypre library, a large linear solver library that is being developed in the Center for Applied Scientific Computing (CASC) at LLNL. The driver provided in the benchmark can build various test problems. The default problem is a Laplace type problem on an unstructured domain with various jumps and an anisotropy in one part.

AMG2013 is written in ISO-C. It is an SPMD code which uses MPI as well as OpenMP. Parallelism is achieved by data decomposition. The driver provided with AMG2013 achieves this decomposition by simply subdividing the grid into logical P x Q x R (in 3D) chunks of equal size. The benchmark was designed to test parallel weak scaling efficiency.

AMG2013 is a highly synchronous code. The communications and computations patterns exhibit the surface-to-volume relationship common to many parallel scientific codes. Hence, parallel efficiency is largely determined by the size of the data "chunks" mentioned above, and the speed of communications and computations on the machine. AMG2013 is also memory-access bound, doing only about 1-2 computations per memory access, so memory-access speeds will also have a large impact on performance.
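The surface-to-volume relationship can be made concrete with a little arithmetic. The Python sketch below is an illustrative model only, not code from the AMG2013 driver: it assumes a regular grid that divides evenly into P x Q x R chunks and treats the number of boundary faces of a chunk as a proxy for communication and the number of cells as a proxy for computation. The function names are hypothetical.

```python
def chunk_shape(grid, procs):
    """Split an nx x ny x nz grid into P x Q x R chunks of equal size."""
    for n, p in zip(grid, procs):
        if n % p != 0:
            raise ValueError("each grid dimension must divide evenly")
    return tuple(n // p for n, p in zip(grid, procs))


def surface_to_volume(shape):
    """Ratio of boundary faces (communication) to cells (computation)."""
    a, b, c = shape
    surface = 2 * (a * b + b * c + a * c)
    volume = a * b * c
    return surface / volume


small = chunk_shape((64, 64, 64), (4, 4, 4))     # 16 x 16 x 16 cells per process
large = chunk_shape((256, 256, 256), (4, 4, 4))  # 64 x 64 x 64 cells per process
```

Under this model the smaller chunks have a ratio of 0.375 and the larger ones 0.09375, which is why larger chunks per process tend to spend proportionally less time communicating.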


AMGCL is a C++ header-only library for constructing an algebraic multigrid (AMG) hierarchy. AMG is one of the most effective methods for the solution of large sparse unstructured systems of equations, arising, for example, from discretization of PDEs on unstructured grids [5,6]. The method can be used as a black-box solver for various computational problems, since it does not require any information about the underlying geometry. AMG is often used not as a standalone solver but as a preconditioner within an iterative solver (e.g. Conjugate Gradients, BiCGStab, or GMRES).

AMGCL builds the AMG hierarchy on a CPU and then transfers it to one of the provided backends. This allows for transparent acceleration of the solution phase with help of OpenCL, CUDA, or OpenMP technologies. Users may provide their own backends which enables tight integration between AMGCL and the user code.
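To show where a preconditioner plugs into an iterative solver, here is a small pure-Python sketch of preconditioned Conjugate Gradients. It does not use AMGCL and does not build an AMG hierarchy: a simple Jacobi (diagonal) preconditioner stands in for AMG, and the 1D Laplace test problem and all function names are assumptions made for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(matvec, b, precond, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)                  # residual b - A x for the zero initial guess
    z = precond(r)               # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = precond(r)           # the only place the preconditioner is applied
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


def laplace_1d(v):
    """Apply the tridiagonal matrix with stencil (-1, 2, -1)."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def jacobi(r):
    """Stand-in preconditioner: divide by the diagonal entry (2)."""
    return [ri / 2.0 for ri in r]


rhs = [1.0] * 20
solution = pcg(laplace_1d, rhs, jacobi)
```

The solver only ever calls `precond(r)`, so an AMG cycle could in principle replace `jacobi` without touching the iteration itself.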

Anaconda Accelerate

Accelerate is an add-on to Continuum’s free enterprise Python distribution, Anaconda. It opens up the full capabilities of your GPU or multi-core processor to Python. Accelerate includes two packages that can be added to your Python installation: NumbaPro and MKL Optimizations. MKL Optimizations makes linear algebra, random number generation, Fourier transforms, and many other operations run faster and in parallel. NumbaPro builds fast GPU and multi-core machine code from easy-to-read Python and NumPy code with a Python-to-GPU compiler.

If you are an academic at a degree-granting institution, all of these add-ons are free of charge. Simply click Anaconda Academic License and fill out the form. If your email address ends in .edu or is in our list of approved academic institutions, the license will be automatically sent to the provided email.

Accelerated Computing with Python

Python is one of the fastest growing and most popular programming languages available. However, as an interpreted language, it has been considered too slow for high-performance computing. That has now changed with the release of the NumbaPro Python compiler from Continuum Analytics.

CUDA Python – Using the NumbaPro Python compiler, which is part of the Anaconda Accelerate package from Continuum Analytics, you get the best of both worlds: rapid iterative development and all other benefits of Python combined with the speed of a compiled language targeting both CPUs and NVIDIA GPUs.


Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.

Not only can it be used for automated configuration management, but it also excels at orchestration, provisioning of systems, zero-time rolling updates and application deployment. Ansible can be used to keep all your systems configured exactly the way you want them, and if you have many identical systems, Ansible will ensure they stay identical. For Linux system administrators, Ansible is an indispensable tool in implementing and maintaining a strong security posture.

Ansible can be used to deploy and configure multiple Unix-like servers (Red Hat, Debian, CentOS, OS X, any of the BSDs and others) using secure shell (SSH) instead of the more common client-server methodologies used by other configuration management packages.


Ansible playbooks for deploying OpenStack.


A set of Ansible playbooks to build and maintain your own private cloud: email, calendar, contacts, file sync, IRC bouncer, VPN, and more.


ANUGA is a Free & Open Source Software (FOSS) package capable of modelling the impact of hydrological disasters such as dam breaks, riverine flooding, storm-surge or tsunamis.

ANUGA is based on the Shallow Water Wave Equation discretised to unstructured triangular meshes using a finite-volumes numerical scheme. A major capability of ANUGA is that it can model the process of wetting and drying as water enters and leaves an area. This means that it is suitable for simulating water flow onto a beach or dry land and around structures such as buildings. ANUGA is also capable of modelling difficult flows involving shock waves and rapidly changing flow speed regimes (transitions from sub critical to super critical flows).


AnyDSL is a framework for the rapid development of domain-specific languages (DSLs). AnyDSL’s main ingredient is its intermediate representation, Thorin. In contrast to other intermediate representations, Thorin features certain abstractions that make it possible to maintain domain-specific types and control flow.

As creating a front-end for some language is a complex and time-consuming endeavor, we offer Impala. This is an imperative language which features as a basis well-known imperative constructs. A DSL developer can hijack Impala such that desired domain-specific types and constructs are available in Impala simply by declaring them. The DSL developer just reuses Impala’s infrastructure (lexer, parser, semantic analysis, and code generator). He does not need to develop his own front-end. Even more important: The decision how to implement domain-specific details is postponed to the expert of the target machine.


AMD OpenCL™ Accelerated Parallel Processing (APP) technology is a set of advanced hardware and software technologies that enable AMD graphics processing cores (GPU), working in concert with the system’s x86 cores (CPU), to execute heterogeneously to accelerate many applications beyond just graphics. This enables better balanced platforms capable of running demanding computing tasks faster than ever, and sets software developers on the path to optimize for AMD Accelerated Processing Units (APUs). The AMD APP Software Development Kit (SDK) is a complete development platform created by AMD to allow you to quickly and easily develop applications accelerated by AMD APP technology. The SDK provides samples, documentation, and other materials to quickly get you started leveraging accelerated compute using OpenCL™, Bolt, or C++ AMP in your C/C++ application, or Aparapi for your Java application.


This package supports a flexible, arbitrarily high level of numeric precision — the equivalent of hundreds or even thousands of decimal digits (up to approximately ten million digits if needed). Special routines are provided for extra-high precision (above 1000 digits). The entire library is written in C++. High-precision real, integer and complex datatypes are supported. Both C++ and Fortran-90 translation modules are also provided that permit one to convert an existing C++ or Fortran-90 program to use the library with only minor changes to the source code. In most cases only the type statements and (in the case of Fortran-90 programs) read/write statements need be changed. Six implementations of PSLQ (one-, two- and three-level, regular and multi-pair) are included, as well as three high-precision quadrature programs. New users are encouraged to use this package, rather than MPFUN90 or MPFUN77 (see below).

This version of the ARPREC package now includes "The Experimental Mathematician’s Toolkit", which is available as the program "mathtool" in the subdirectory "toolkit". This is a complete interactive high-precision arithmetic computing environment. One enters expressions in a Mathematica-style syntax, and the operations are performed using the ARPREC package, with a level of precision that can be set from 100 to 1000 decimal digits. Variables and vector arrays can be defined and referenced. This program supports all basic arithmetic operations, common transcendental and combinatorial functions, multi-pair PSLQ (one-, two- or three-level versions), high-precision quadrature, i.e. numeric integration (Gaussian, error function or tanh-sinh), and summation of series.
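For readers unfamiliar with tanh-sinh quadrature, the following Python sketch shows the basic rule on the interval [-1, 1]. It is not ARPREC code and works only in ordinary double precision; the function name, the step size `h`, and the number of nodes `n` are assumptions chosen for the example.

```python
import math


def tanh_sinh(f, h=0.0625, n=64):
    """Approximate the integral of f over [-1, 1] with the tanh-sinh rule.

    The substitution x = tanh((pi / 2) * sinh(t)) maps the real line onto
    (-1, 1); the transformed integrand decays so quickly that a plain
    trapezoidal sum over t = -n*h, ..., n*h is already very accurate.
    """
    total = 0.0
    for k in range(-n, n + 1):
        t = k * h
        u = 0.5 * math.pi * math.sinh(t)
        x = math.tanh(u)                                      # node
        w = 0.5 * math.pi * math.cosh(t) / math.cosh(u) ** 2  # weight
        total += w * f(x)
    return h * total


area = tanh_sinh(lambda x: x * x)  # exact value is 2/3
```

A high-precision implementation would follow the same pattern with multiprecision arithmetic and would halve `h` until the result stops changing at the requested number of digits.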


ArrayFire is a high performance software library for parallel computing with an easy-to-use API. Its array based function set makes parallel programming simple. ArrayFire’s multiple backends (CUDA, OpenCL and native CPU) make it platform independent and highly portable. A few lines of code in ArrayFire can replace dozens of lines of parallel computing code, saving you valuable time and lowering development costs.


AsciiDoc is a text document format for writing notes, documentation, articles, books, ebooks, slideshows, web pages, man pages and blogs. AsciiDoc files can be translated to many formats including HTML, PDF, EPUB, man page. AsciiDoc is highly configurable: both the AsciiDoc source file syntax and the backend output markups (which can be almost any type of SGML/XML markup) can be customized and extended by the user.

See also Magic-Book-Project.


AsciidocToGo is a full-featured portable version of AsciiDoc that contains the complete toolchain to build HTML or DocBook/LaTeX-based PDF documentation out of plain ASCII text files. Just download AsciidocToGo and start writing instead of searching for days or maybe weeks to put together all of the required software parts.


Asciidoctor is a fast text processor and publishing toolchain for converting AsciiDoc content to HTML5, DocBook 5 (or 4.5) and other formats. The Asciidoctor project is an effort to bring a comprehensive and accessible publishing toolchain, centered around the AsciiDoc syntax, to a growing range of ecosystems, including Ruby, JavaScript and the JVM.

In addition to the standard AsciiDoc syntax, Asciidoctor recognizes additional markup and formatting options, such as font-based icons (e.g., icon:fire[]) and UI elements (e.g., button:[Save]). Asciidoctor also offers a modern, responsive theme based on Foundation to style the HTML5 output.

In addition to an AsciiDoc processor and a collection of stylesheets, the project provides plugins for Maven, Gradle and Guard and packages for operating systems such as Fedora, Debian and Ubuntu. It also pushes AsciiDoc to evolve by introducing new ideas and innovation and helps promote AsciiDoc through education and advocacy.


An assortment of backends (i.e., templates) for Asciidoctor, a pure Ruby port of the AsciiDoc markup language. In this repository, you’ll find replicas of both the html5 and docbook45 backends from AsciiDoc (and Asciidoctor) written in both Haml and Slim, as well as backends for generating HTML5 presentations from AsciiDoc.


Asciidoctor EPUB3 is a set of Asciidoctor extensions for converting AsciiDoc to EPUB3 & KF8/MOBI.


A Gradle plugin that uses Asciidoctor via JRuby to process AsciiDoc source files within the project.


Asciidoctor LaTeX is a set of Asciidoctor extensions for converting AsciiDoc to LaTeX.


A native PDF renderer for AsciiDoc based on Asciidoctor and Prawn.


JavaScript port of Asciidoctor produced by Opal, a Ruby to JavaScript cross compiler.


DocGist is a URL proxy tool that converts AsciiDoc documents fetched from Gists, GitHub repositories, Dropbox folders and other sources to HTML. The conversion to HTML is performed in the browser (client-side) using the Asciidoctor.js JavaScript library. DocGist can render documents located anywhere, as long as the host permits cross-domain access.


The Magic Book Project is an open-source framework that facilitates the design and production of electronic and print books for authors. Rather than type into a word processor, the Magic Book Project allows an author to write a book once (using ASCIIDOC, a simple text document format) and procedurally generate the layout for a variety of formats using modern code-based design tools, such as CSS, the stylesheet standard. Write your book once, press a magic button, and out come multiple versions: printed hardcopy, digital PDF, HTML, MOBI, and EPUB.


MPLW is a Matplotlib (MPL) wrapper that can work as an AsciiDoc filter. Using this filter, you can generate plots from inline Matplotlib scripts.


A complete editor for structured text documents with proofreading features. RTextDoc is designed for typesetting professional research papers using LaTeX that are heavy on mathematics and images. In addition, it is designed for writing notes, books, ebooks, slideshows, web pages, man pages and blogs using AsciiDoc mark-up language. RTextDoc also supports DocBook.


Real-time collaborative editor for AsciiDoc files.


asciinema is a free and open source solution for recording terminal sessions and sharing them on the web. When you run asciinema rec in your terminal, the recording starts, capturing all output that is printed to your terminal while you issue shell commands. When the recording finishes (by hitting Ctrl-D or typing exit), the captured output is uploaded to the website and prepared for playback on the web.


All three methods have been implemented in the new MAPLE package ASP (Automated Symmetry Package) which is an add-on to the MAPLE symmetry package DESOLVII (Vu, Jefferson and Carminati (2012) [25]). To our knowledge, this is the first computer package to automate all three methods of determining approximate symmetries for differential systems. Extensions to the theory have also been suggested for the third method and which generalise the first method to systems of differential equations. Finally, a number of approximate symmetries and corresponding solutions are compared with results in the literature.

assembler software

Software Optimization Resources -

flat assembler

A fast and efficient self-assembling x86 assembler for DOS, Windows and Linux operating systems. Currently it supports x86 and x86-64 instruction sets with MMX, 3DNow!, SSE up to SSE4, AVX, AVX2 and XOP extensions, and can produce output in plain binary, MZ, PE, COFF or ELF format. It includes powerful but easy-to-use macroinstruction support and does multiple passes to optimize the instruction codes for size. The flat assembler is self-compilable and the complete source code is included.


A library for working with assembler code. The HeavyThing library includes automatic profiling support, for both user and library code.


A simple to install and use toolsuite of command line applications for performance oriented programmers. It works for Intel and AMD processors on the Linux operating system.


A high-performance SIMD-optimized mathematical library for x86, ARM, and MIPS processors on Windows, Android, Mac OS X, and GNU/Linux systems. Yeppp officially supports the C, C++, C#, Java, and FORTRAN programming languages.


Assimulo is a simulation package for solving ordinary differential equations. It is written in the high-level programming language Python and combines a variety of different solvers written in FORTRAN, C and even Python via a common high-level interface. The primary aim of Assimulo is not to develop new integration algorithms. The aim is to provide a high-level interface for a wide variety of solvers, both new and old, solvers of industrial standards as well as experimental solvers. The aim is to allow comparison of solvers for a given problem without the need to define the problem in a number of different programming languages to accommodate the different solvers.

asynchronous communication


A simple, pure-Python solution for writing intelligible asynchronous socket applications. It uses PEP 342 coroutines to make concurrent I/O look and act like sequential programming. In this way, it is similar to the Greenlet green-threads library and its associated packages Eventlet and Gevent. Bluelet has a simpler, 100% Python implementation that comes at the cost of flexibility and performance when compared to Greenlet-based solutions. However, it should be sufficient for many applications that don’t need serious scalability; it can be thought of as a less-horrible alternative to asyncore or an asynchronous replacement for SocketServer (and more).
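The coroutine style described here can be sketched with plain generators. The example below does not use Bluelet's API: `run` and `worker` are hypothetical names, and the scheduler simply echoes each request back instead of waiting on sockets. It only illustrates how PEP 342's `send` lets concurrent tasks read like sequential code.

```python
from collections import deque


def run(tasks):
    """Round-robin scheduler: resume each generator until all are finished."""
    ready = deque((task, None) for task in tasks)
    while ready:
        task, value = ready.popleft()
        try:
            request = task.send(value)   # PEP 342: pass a value into the coroutine
        except StopIteration:
            continue                     # this task is done
        # A real event loop would wait on I/O here; this sketch echoes the
        # request straight back as the result of the yield expression.
        ready.append((task, request))


log = []


def worker(name, steps):
    for i in range(steps):
        reply = yield (name, i)          # suspend; the scheduler resumes us later
        log.append(reply)


run([worker("a", 2), worker("b", 3)])
```

After the call, `log` holds the replies in the order the scheduler delivered them, with the two workers interleaved even though each one is written as a simple loop.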


The greenlet package is a spin-off of Stackless, a version of CPython that supports micro-threads called “tasklets”. Tasklets run pseudo-concurrently (typically in a single or a few OS-level threads) and are synchronized with data exchanges on “channels”.

A “greenlet”, on the other hand, is a still more primitive notion of micro-thread with no implicit scheduling; coroutines, in other words. This is useful when you want to control exactly when your code runs. You can build custom scheduled micro-threads on top of greenlet; however, it seems that greenlets are useful on their own as a way to make advanced control flow structures. For example, we can recreate generators; the difference with Python’s own generators is that our generators can call nested functions and the nested functions can yield values too.

Greenlets are lightweight coroutines for in-process concurrent programming.


A concurrent networking library for Python that allows you to change how you run your code, not how you write it.


A coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libev event loop. Gevent is inspired by eventlet but features a more consistent API, a simpler implementation and better performance.


A socket library that provides several common communication patterns. It aims to make the networking layer fast, scalable, and easy to use. Implemented in C, it works on a wide range of operating systems with no further dependencies. The communication patterns, also called "scalability protocols", are basic blocks for building distributed systems. By combining them you can create a vast array of distributed applications.

Scalability protocols are layered on top of the transport layer in the network stack. At the moment, the nanomsg library supports INPROC, TCP and IPC.


An asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients. Netty is a NIO client server framework which enables quick and easy development of network applications such as protocol servers and clients. It greatly simplifies and streamlines network programming such as the development of TCP and UDP socket servers.

Quick and easy doesn’t mean that the resulting application will suffer from maintainability or performance issues. Netty has been designed carefully with the experience gained from implementing many protocols such as FTP, SMTP, HTTP, and various binary and text-based legacy protocols. As a result, Netty has succeeded in finding a way to achieve ease of development, performance, stability, and flexibility without compromise.


Robust messaging for applications.


ZeroMQ (also known as ØMQ, 0MQ, or zmq) looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports like in-process, inter-process, TCP, and multicast. You can connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply. It’s fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks. It has a score of language APIs and runs on most operating systems.


An authentication and encryption protocol for ZeroMQ.


A text editor for the 21st century.


Authorea is the collaborative platform for research. Write and manage your technical documents in one place.


A universal cross assembler. The C compiler is used to process your source along with a set of macro definitions to do code generation for your specific processor.

Azimuth Project

The Azimuth Project is an international collaboration to create a focal point for scientists and engineers interested in saving the planet. Our goal is to make clearly presented, accurate information on the relevant issues easy to find, and to help people work together on our common problems.


Aztec is a library that provides algorithms for the iterative solution of large sparse linear systems arising in scientific and engineering applications. It is a stand-alone package comprising a set of iterative solvers, preconditioners and matrix-vector multiplication routines. Users are not required to provide their own matrix-vector multiplication routines or preconditioners in order to solve a linear system.

The Aztec library is written in C and is also callable from Fortran. Overall, the package was designed to be portable and easy to use. The user may input the linear system in a simple format and Aztec will perform the necessary transformations for the matrix-vector multiplication and preconditioning. After the transformations, the iterative solvers can run efficiently. If the input matrix is suitably partitioned, the efficiency can be further enhanced.

The major components of Aztec are implementations of iterative solvers (CG, CGS, BiCGSTAB, GMRES and TFQMR) and preconditioners (point Jacobi, block Jacobi, Gauss-Seidel, least-squares polynomials, and overlapping domain decomposition using sparse LU, ILU, ILUT, BILU and ICC within domains). Aztec supports two different sparse matrix notations: a) a point-entry modified sparse row (MSR) format; b) a block-entry variable block row (VBR) format. These two formats have been generalized for parallel implementation and the library includes highly optimized matrix-vector multiply kernels and preconditioners for both types of data structures.
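As an illustration of one solver and preconditioner pairing named above, here is a minimal sketch of conjugate gradients (CG) with a point Jacobi preconditioner. It is not Aztec's API or storage format: the function name `pcg_jacobi` is hypothetical, the matrix is a small dense list of lists rather than MSR or VBR, and the matrix is assumed to be symmetric positive definite.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A (dense list of lists)."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # point Jacobi: M = diag(A)
    x = [0.0] * n
    r = list(b)                                    # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg_jacobi(A, b)
print(x)
```

A production library would replace the dense `matvec` with a sparse kernel; the structure of the iteration stays the same.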

See also AztecOO.


BaCon is a free BASIC to C translator for Unix-based systems, which runs on most Unix/Linux/BSD platforms, including MacOSX. It intends to be a programming aid in creating tools which can be compiled on different platforms (including 64bit environments), while trying to revive the days of the good old BASIC. BaCon can be described as a translator, a converter, a source-to-source compiler, a transcompiler or a transpiler. It also can be described as a very elaborate preprocessor to C.

BaCon is implemented in generic shell script and in itself. Therefore, to start using BaCon, the target system must have either Korn Shell, ZShell, or Bourne Again Shell (BASH) available. If none of these shells is available, download and install the free Public Domain Korn Shell, which can also execute BaCon. BaCon also works with newer Korn Shell implementations such as the MirBSD Korn Shell. The shell script implementation can convert and compile the BaCon version of BaCon. This delivers the binary version of BaCon, which has an extremely high conversion performance. On newer systems, the average conversion rate usually lies above 10,000 lines per second. Code converted by BaCon can be compiled by GCC, the Compaq C Compiler, TCC, the clang/LLVM compiler and possibly also by other C compilers.


Baudline is a time-frequency browser designed for scientific visualization of the spectral domain. Signal analysis is performed by Fourier, correlation, and raster transforms that create colorful spectrograms with vibrant detail. Conduct test and measurement experiments with the built in function generator, or play back audio files with a multitude of effects and filters. The baudline signal analyzer combines fast digital signal processing, versatile high speed displays, and continuous capture tools for hunting down and studying elusive signal characteristics.


Beaker is a notebook-style development environment for working interactively with large and complex datasets. Its plugin-based architecture allows you to switch between languages or add new ones with ease, ensuring that you always have the right tool for any of your analysis and visualization needs.


Bertini is a general-purpose solver, written in C, that was created for research about polynomial continuation.

See also PHCpack.

BID Data Project

The BID Data Suite is a collection of hardware, software and design patterns that enable fast, large-scale data mining at very low cost. The software consists of two parts:

  • BIDMat, an interactive matrix library that integrates CPU and GPU acceleration and novel computational kernels.

  • BIDMach, a machine learning system that includes very efficient model optimizers and mixing strategies.

BIDMach is an interactive environment designed to make it extremely easy to build and use machine learning models. BIDMach includes core classes that take care of managing data sources, optimization and distributing data over CPUs or GPUs. It’s very easy to write your own models by generalizing from the models already included in the Toolkit.


This package enables virtual arrays of arbitrary size, with arithmetic and statistical operations, and conversion to NumPy ndarrays. Virtual arrays can be stacked to increase their dimensionality, or tiled to increase their extent. Biggus includes support for easily wrapping data sources which produce NumPy ndarray objects via slicing, e.g. netcdf4python Variable instances, and NumPy ndarray instances. All operations are performed in a lazy fashion to avoid overloading system resources. Conversion to a concrete NumPy ndarray requires an explicit method call.


BigView allows for interactive panning and zooming of images of arbitrary size on desktop PCs running Linux. Additionally, it can work in a multi-screen environment where multiple PCs cooperate to view a single, large image. Using this software, one can explore — on relatively modest machines — images such as the Mars Orbiter Camera mosaic [92,160×33,280 pixels].

The images must be first converted into “paged” format, where the image is stored in 256×256 “pages” to allow rapid movement of pixels into texture memory. The format contains an “image pyramid”: a set of scaled versions of the original image. Each scaled image is 1/2 the size of the previous, starting with the original down to the smallest, which fits into a single 256×256 page.
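The pyramid arithmetic described above can be sketched as follows. This is not BigView's code: the function names are hypothetical, and it assumes that "1/2 the size" means halving each dimension, rounding up.

```python
PAGE = 256  # page edge length in pixels, as stated in the text


def pyramid_levels(width, height, page=PAGE):
    """Return (width, height) per level until the image fits in one page."""
    levels = [(width, height)]
    while width > page or height > page:
        width = (width + 1) // 2    # assumption: round up when halving
        height = (height + 1) // 2
        levels.append((width, height))
    return levels


def pages_needed(width, height, page=PAGE):
    """Number of page-sized tiles needed to cover an image."""
    cols = (width + page - 1) // page
    rows = (height + page - 1) // page
    return cols * rows


levels = pyramid_levels(92160, 33280)
for level, (w, h) in enumerate(levels):
    print(level, w, h, pages_needed(w, h))
```

Under these assumptions, the Mars Orbiter Camera mosaic needs ten levels, from 46,800 pages at full resolution down to a single 180×65 image in one page.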


A repository for Conda binaries, amongst other things.

Rich Signell’s Binstar -


The Basic Linear Algebra Subprograms (BLAS) are a specified set of low-level subroutines that perform common linear algebra operations such as copying, vector scaling, vector dot products, linear combinations, and matrix multiplication. They were first published as a Fortran library in 1979 and are still used as a building block in higher-level math programming languages and libraries.
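To make the listed operations concrete, here is a pure-Python sketch of their semantics. It is not a BLAS binding and makes no attempt at performance; the function names are illustrative, loosely modelled on the operations that BLAS routines such as `dscal`, `ddot`, `daxpy` and `dgemm` perform.

```python
def scal(alpha, x):
    """Vector scaling: alpha * x."""
    return [alpha * xi for xi in x]


def dot(x, y):
    """Vector dot product: sum of x[i] * y[i]."""
    return sum(xi * yi for xi, yi in zip(x, y))


def axpy(alpha, x, y):
    """Linear combination: alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemm(A, B):
    """Matrix multiplication for matrices stored as lists of rows."""
    inner = len(B)
    cols = len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


print(scal(2, [1, 2, 3]))
print(dot([1, 2, 3], [4, 5, 6]))
print(axpy(2, [1, 2], [10, 20]))
print(gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```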

See also ACML and OSKI.


Automatically Tuned Linear Algebra Software (ATLAS) is a software library for linear algebra. It provides a mature open source implementation of BLAS APIs for C and Fortran77. ATLAS is often recommended as a way to automatically generate an optimized BLAS library. While its performance often trails that of specialized libraries written for one specific hardware platform, it is often the first or even only optimized BLAS implementation available on new systems and is a large improvement over the generic BLAS available at Netlib. For this reason, ATLAS is sometimes used as a performance baseline for comparison with other products.

This site contains the official reference implementation of BLAS, from which all others have flowed. There are Fortran and C versions which should compile just about anywhere, although they are not optimized for a specific processor beyond the capabilities of the compiler used.


A software framework for instantiating high-performance BLAS-like dense linear algebra libraries. The BLAS-like Library Instantiation Software (BLIS) is a framework for rapid instantiation of high-performance libraries with Basic Linear Algebra Subprograms (BLAS) functionality.

Build to Order BLAS

The Build to Order BLAS system is a compiler that generates high-performance implementations of basic linear algebra kernels.

The term BLAS in the name is for Basic Linear Algebra Subprograms. The BLAS is a standard API for important linear algebra operations. The BLAS are implemented by most hardware vendors. Traditionally, each routine in the BLAS is implemented by hand by a highly skilled programmer. The Build to Order BLAS compiler automates the implementation of not only the BLAS standard but also any sequence of basic linear algebra operations.

The user of the Build to Order BLAS compiler writes down a specification for a sequence of matrix and vector operations together with a description of the input and output parameters. The compiler then tries out many different choices of how to implement, optimize, and tune those operations for the user’s computer hardware. The compiler chooses the best option, which is output as a C file containing a function that implements the specified operations.


This repository houses the code for the OpenCL™ BLAS portion of clMath. The complete set of BLAS level 1, 2 & 3 routines is implemented. Please see Netlib BLAS for the list of supported routines. In addition to GPU devices, the library also supports running on CPU devices to facilitate debugging and multicore programming. APPML 1.10 is the most current generally available pre-packaged binary version of the library available for download for both Linux and Windows platforms.

The primary goal of clBLAS is to make it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing. clBLAS interfaces neither hide nor wrap OpenCL interfaces, but rather leave OpenCL state management under the control of the user to allow for maximum performance and flexibility. The clBLAS library does generate and enqueue optimized OpenCL kernels, relieving the user from the task of writing, optimizing and maintaining kernel code themselves.


clMath is the open-source project for OpenCL based BLAS and FFT libraries. The complete set of BLAS level 1, 2 & 3 routines is implemented.


The NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library is a GPU-accelerated version of the complete standard BLAS library that delivers 6x to 17x faster performance than the latest MKL BLAS. New in CUDA 6.0 is multi-GPU support in cuBLAS-XT.


A set of routines which accelerate Level 3 BLAS (Basic Linear Algebra Subroutine) calls by spreading work across more than one GPU. By using a streaming design, cuBLAS-XT efficiently manages transfers across the PCI-Express bus automatically, which allows input and output data to be stored on the host’s system memory. This provides out-of-core operation – the size of operand data is only limited by system memory size, not by GPU on-board memory size.

Starting with CUDA 6.0, a free version of cuBLAS-XT is included in the CUDA toolkit as part of the cuBLAS library.


KBLAS (KAUST-BLAS) is a small open-source library that optimizes critical numerical kernels on CUDA-enabled GPUs. KBLAS provides a subset of standard BLAS functions. It also provides some functions with a BLAS-like interface that target both single- and multi-GPU systems.

The ultimate goal for KBLAS is performance. KBLAS has a set of tuning parameters that affect its performance according to the GPU architecture and the CUDA runtime version. While we cannot guarantee optimal performance with the default tuning parameters, users can easily edit these parameters on their local systems. KBLAS might be shipped with autotuners in the future.


This C++ class library introduces Matrix, Vector, subMatrices, and LAStreams over the real domain. The library contains efficient and fool-proof implementations of level 1 and 2 BLAS (element-wise operations and various multiplications), transposition, determinant evaluation and matrix inverse. There are operations on a single row/col/diagonal of a matrix. Distinct features of the package are Matrix views, Matrix streams, and LazyMatrices. Lazy construction allows us to write matrix expressions in a natural way without introducing any hidden temporaries, deep copying, and any reference counting.


OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.


A high performance C++ implementation of BLAS (Basic Linear Algebra Subprograms). Standard conforming interfaces for C and Fortran are provided.


Blitz++ is a C++ class library for scientific computing which provides performance on par with Fortran 77/90. It uses template techniques to achieve high performance. It provides dense arrays and vectors, random number generators, and small vectors (useful for representing multicomponent or vector fields). It uses advanced C++ template metaprogramming techniques, including expression templates, to provide speed-optimized mathematical operations on sequences of data without sacrificing the natural syntax provided by other mathematical programming systems.


Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) is a package, written in C and MATLAB/OCTAVE, that includes an eigensolver implemented with the Locally Optimal Block Preconditioned Conjugate Gradient Method (LOBPCG). Its main features are: a matrix-free iterative method for computing several extreme eigenpairs of symmetric positive generalized eigenproblems; a user-defined symmetric positive preconditioner; robustness with respect to random initial approximations, variable preconditioners, and ill-conditioning of the stiffness matrix; and apparently optimal convergence speed.

BLOPEX supports parallel MPI-based computations. BLOPEX is incorporated in the HYPRE package and is available as an external block to the PETSc package. SLEPc and PHAML have interfaces to call BLOPEX eigensolvers.


A blocking, shuffling and loss-less compression library. Blosc is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc is the first compressor (that I’m aware of) that is meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate memory-bound computations (which is typical in vector-vector operations).
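The effect of byte shuffling before compression can be sketched with the standard library. This is not Blosc's implementation: `shuffle` and `unshuffle` are hypothetical helpers, `zlib` stands in for Blosc's codecs, and the input length is assumed to be a multiple of the item size.

```python
import struct
import zlib


def shuffle(data, itemsize):
    """Group byte 0 of every item together, then byte 1, and so on."""
    return b"".join(data[k::itemsize] for k in range(itemsize))


def unshuffle(data, itemsize):
    """Inverse of shuffle: interleave the byte planes back into items."""
    count = len(data) // itemsize
    out = bytearray(len(data))
    for k in range(itemsize):
        out[k::itemsize] = data[k * count:(k + 1) * count]
    return bytes(out)


# 10,000 consecutive 32-bit little-endian integers.
raw = struct.pack("<10000i", *range(10000))
shuffled = shuffle(raw, 4)
restored = unshuffle(shuffled, 4)

raw_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(shuffled))
print(raw_size, shuffled_size)
```

For slowly varying numeric data, the high-order bytes of neighbouring items are nearly identical, so grouping them yields long runs that a general-purpose compressor handles well.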


Command line interface to and serialization format for Blosc, a high performance, multi-threaded, blocking and shuffling compressor. Uses python-blosc bindings to interface with Blosc. Also comes with native support for efficiently serializing and deserializing Numpy arrays.


An open source library providing a Python interface to ndarrays backed by local or distributed implementations (currently targeting Spark). Designed to make working with big array data in Python as easy and seamless as in local settings, while exploiting the speed of proven distributed engines. Bolt exposes array operations through either local or distributed implementations with a common interface, and makes it easy to switch between them.


Boltons is a set of over 100 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously missing from — the standard library, including:

  • Atomic file saving, bolted on with fileutils

  • A highly-optimized OrderedMultiDict, in dictutils

  • Two types of PriorityQueue, in queueutils

  • Chunked and windowed iteration, in iterutils

  • A full-featured TracebackInfo type, for representing stack traces, in tbutils
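To show what chunked and windowed iteration mean, here is a standard-library sketch. These are not the boltons implementations: `chunked_sketch` and `windowed_sketch` are hypothetical names, and the behaviour shown (a short final chunk, windows as tuples) is an assumption made for illustration.

```python
from collections import deque
from itertools import islice


def chunked_sketch(iterable, size):
    """Yield consecutive lists of up to `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def windowed_sketch(iterable, size):
    """Yield overlapping tuples of exactly `size` consecutive items."""
    window = deque(maxlen=size)
    for item in iterable:
        window.append(item)
        if len(window) == size:
            yield tuple(window)


print(list(chunked_sketch(range(7), 3)))
print(list(windowed_sketch(range(5), 3)))
```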


A set of libraries for the C++ programming language that provide support for tasks and structures such as linear algebra, pseudorandom number generation, multithreading, image processing, regular expressions, and unit testing. It contains over eighty individual libraries.


Web sites are made of lots of things — frameworks, libraries, assets, utilities, and rainbows. Bower manages all these things for you.

Bower works by fetching and installing packages from all over, taking care of hunting, finding, downloading, and saving the stuff you’re looking for. Bower keeps track of these packages in a manifest file, bower.json. How you use packages is up to you. Bower provides hooks to facilitate using packages in your tools and workflows.

Bower is optimized for the front-end. Bower uses a flat dependency tree, requiring only one version for each package, reducing page load to a minimum.

A very useful thing is the search engine for packages that can be installed by Bower.


BRL-CAD is a powerful cross-platform open source solid modeling system that includes interactive geometry editing, high-performance ray-tracing for rendering and geometric analysis, image and signal-processing tools, a system performance analysis benchmark suite, and libraries for robust geometric representation, all backed by more than 20 years of active development.



A free and open source MPEG2 transport stream data generator and packet manipulator.


Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).


The Cactus Framework is an open-source, modular, portable programming environment for the collaborative development and deployment of scientific applications using high-performance computing. The name Cactus comes from the design of a central core ("flesh") which connects to application modules ("thorns") through an extensible interface. Thorns can implement custom developed scientific or engineering applications, such as computational fluid dynamics. Other thorns from a standard computational toolkit provide a range of computational capabilities, such as parallel I/O, data distribution, or checkpointing.

Cactus runs on many architectures. Applications, developed on standard workstations or laptops, can be seamlessly run on clusters or supercomputers. Cactus provides easy access to many cutting edge software technologies being developed in the academic research community, including the Globus Metacomputing Toolkit, HDF5 parallel file I/O, the PETSc scientific library, adaptive mesh refinement, web interfaces, and advanced visualization tools.

Cactus: Issues for Sustainable Simulation Software (online article) -



ImplicitCAD is a project dedicated to using the power of math and computer science to get stupid design problems out of the way of the 3D printing revolution.


OpenSCAD is software for creating solid 3D CAD models. It is free software and available for Linux/UNIX, Windows and Mac OS X. Unlike most free software for creating 3D models (such as Blender), it does not focus on the artistic aspects of 3D modelling but instead on the CAD aspects. Thus it might be the application you are looking for when you are planning to create 3D models of machine parts, but it is almost certainly not what you are looking for when you are more interested in creating computer-animated movies.

OpenSCAD is not an interactive modeller. Instead it is something like a 3D-compiler that reads in a script file that describes the object and renders the 3D model from this script file. This gives you (the designer) full control over the modelling process and enables you to easily change any step in the modelling process or make designs that are defined by configurable parameters.

OpenSCAD provides two main modelling techniques: first, constructive solid geometry (CSG), and second, extrusion of 2D outlines. AutoCAD DXF files are used as the data exchange format for these 2D outlines. In addition to 2D paths for extrusion, it is also possible to read design parameters from DXF files. Besides DXF files, OpenSCAD can read and create 3D models in the STL and OFF file formats.


Calaos is a free software project (GPLv3) that lets you control and monitor your home. You can easily install and use it to transform your home into a smart home.


Cartopy is a Python package designed to make drawing maps for data analysis and visualisation as easy as possible. Cartopy makes use of the powerful PROJ.4, numpy and shapely libraries and has a simple and intuitive drawing interface to matplotlib for creating publication quality maps.

The features of Cartopy include:

  • object oriented projection definitions

  • point, line, vector, polygon and image transformations between projections

  • integration to expose advanced mapping in matplotlib with a simple and intuitive interface

  • powerful vector data handling by integrating shapefile reading with Shapely capabilities


It is already common for simulations to discard most of what they compute in order to minimize time spent on I/O. As we enter the exascale age the problem of scarce I/O capability continues to grow. Since storing data is no longer viable for many simulation applications, data analysis and visualization must now be performed in situ with the simulation to ensure that it is running smoothly and to fully understand the results that the simulation produces. Catalyst is a light-weight version of the ParaView server library that is designed to be directly embedded into parallel simulation codes to perform in situ analysis at run time.


A C and Fortran Interface to access Climate and NWP model Data. Supported data formats are GRIB, netCDF, SERVICE, EXTRA and IEG.


Software that enables our collaborators to easily harness large scale distributed systems such as clusters, clouds, and grids. We perform fundamental computer science research that enables new discoveries through computing in fields such as physics, chemistry, bioinformatics, biometrics, and data mining. The tools are:

  • Parrot - Parrot is a tool for attaching existing programs to remote I/O systems through the filesystem interface. Parrot "speaks" a variety of remote I/O services, including HTTP, FTP, GridFTP, iRODS, HDFS, XRootD, GROW, and Chirp, on behalf of ordinary programs.

  • Chirp - A user-level file system for collaboration across distributed systems such as clusters, clouds, and grids. Chirp allows ordinary users to discover, share, and access storage, whether within a single machine room or over a wide area network.

  • Makeflow - A workflow engine for executing large complex workflows on clusters, clouds, and grids. Makeflow is very similar to traditional Make, so if you can write a Makefile, then you can write a Makeflow.

  • Work Queue - A framework for building large master-worker applications that span many computers including clusters, clouds, and grids. Work Queue applications are written in C, Perl, or Python using a simple API that allows users to define tasks, submit them to the queue, and wait for completion. Tasks are executed by a standard worker process that can run on any available machine. Each worker calls home to the master process, arranges for data transfer, and executes the tasks. The system handles a wide variety of failures, allowing for dynamically scalable and robust applications.

  • SAND - A set of modules for genome assembly that are built atop the Work Queue platform for large-scale distributed computation on clusters, clouds, or grids.
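The define, submit and wait pattern described for Work Queue can be sketched on a single machine with Python's standard library. This is not the Work Queue API: threads stand in for remote worker processes, and `run_task` and `master` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(task_id, payload):
    """Work done by a worker; here it simply sums the payload."""
    return task_id, sum(payload)


def master(tasks, workers=4):
    """Submit every task to the pool, then collect results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_task, task_id, payload)
                   for task_id, payload in tasks.items()]
        for future in as_completed(futures):
            task_id, value = future.result()
            results[task_id] = value
    return results


tasks = {i: list(range(i + 1)) for i in range(5)}
results = master(tasks)
print(sorted(results.items()))
```

Results arrive in completion order rather than submission order, which is why they are keyed by task identifier.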


A large tool set for working on climate and NWP model data. NetCDF 3/4, GRIB 1/2 including SZIP and JPEG compression, EXTRA, SERVICE and IEG are supported as IO-formats. Apart from that CDO can be used to analyse any kind of gridded data not related to climate science. CDO has very small memory requirements and can process files larger than the physical memory.

configure --enable-cdi-lib --with-fftw3 --with-jasper=/usr/lib64 \
  --with-libxml2=yes --with-udunits2=/usr/lib64 --with-curl=/usr/lib64 \
  --with-proj=/usr/lib64 --with-netcdf=yes --with-hdf5=yes --with-szlib=yes \
  --with-threads=yes --with-grib-api=yes


configure: CDO is configured with the following options:
   "CC"                 : "gcc -std=gnu99",
   "CPP"                : "gcc -E",
   "CPPFLAGS"           : "-I/usr/lib64/include -I/usr/lib64/include -I/usr/lib64/include -I/usr/lib64/include -I/usr/include/libxml2",
   "CFLAGS"             : "-g -O2 -fopenmp ",
   "LDFLAGS"            : "-L/usr/lib64/lib -L/usr/lib64/lib  -L/usr/lib64/lib -L/usr/lib64/lib",
   "LIBS"               : "-lxml2 -ludunits2 -lcurl -lproj -lfftw3 -lgrib_api -ljasper -lnetcdf -lhdf5_hl -lhdf5 -lsz -lz  -lm ",
   "FCFLAGS"            : "",
   "INCLUDES"           : "@INCLUDES@",
   "LD"                 : "/usr/bin/ld -m elf_x86_64",
   "NM"                 : "/usr/bin/nm -B",
   "AR"                 : "ar",
   "AS"                 : "as",
   "DLLTOOL"            : "false",
   "OBJDUMP"            : "objdump",
   "STRIP"              : "strip",
   "RANLIB"             : "ranlib",
   "INSTALL"            : "/usr/bin/install -c",
   "cdi"                : {
     "enable_cdi_lib" : true
  "threads"    : {
    "lib"      : "",
    "include"  : ""
  "zlib"       : {
    "lib"      : " -lz",
  "szlib"      : {
    "lib"      : " -lsz",
    "include"  : ""
  "hdf5"       : {
    "lib"      : " -lhdf5",
    "include"  : ""
  "netcdf"     : {
    "lib"      : " -lnetcdf",
    "include"  : ""
  "udunits2"   : {
    "lib"      : " -L/usr/lib64/lib -ludunits2",
    "include"  : " -I/usr/lib64/include"
  "proj"       : {
    "lib"      : " -L/usr/lib64/lib -lproj",
    "include"  : " -I/usr/lib64/include"
  "USER_NAME"          : "baum",
  "HOST_NAME"          : "max",
  "SYSTEM_TYPE"        : "x86_64-unknown-linux-gnu"


Cdo{rb,py} allows you to use CDO from Python and Ruby as if it were a native library.

CDSC Mapper

The CDSC Mapper is a compiler package for heterogeneous mapping on various targets such as multi-core CPUs, GPUs and FPGAs. The objective is to provide the user with a complete compilation platform to ease the programming of complex heterogeneous devices, such as a Convey HC1-ex machine. The architecture of the compiler is based on a collection of production-quality compilers such as GNU GCC, Nvidia GCC and LLVM; two open-source compilation infrastructures on top of which development has been performed: the LLNL ROSE compiler and the LLVM project; and a collection of research compilers and runtime such as CnC-HC, PolyOpt and SDSLc.


The CEOP Satellite Data Server is actually a gateway with an OPeNDAP front end and the ability to access data via the OGC WCS protocol on the backend. Though originally developed for the Coordinated Enhanced Observing Period (CEOP) effort, it can be used with other WCS servers. It is implemented as a plug-in handler to the Hyrax server distributed by OPeNDAP.


Cetus is a compiler infrastructure for the source-to-source transformation of software programs. It currently supports ANSI C. Since its creation in 2004, it has grown to over 80,000 lines of Java code, has been made available publicly on the web, and has become a basis for several research projects.

CFD Utilities

The CFD Utility Software Library (previously known as the Aerodynamics Division Software Library at NASA Ames Research Center) contains nearly 30 libraries of generalized subroutines and close to 100 applications built upon those libraries. These utilities have accumulated during four decades or so of software development in the aerospace field.

All are written in Fortran 90 or FORTRAN 77 with potential reuse in mind. The only exception is the C translations of a dozen or so numerics routines grouped as C_utilities.

David Saunders and Robert Kennelly are the primary authors, but miscellaneous contributions by others are gratefully acknowledged.

See 1-line summaries of the libraries and applications under the Files menu. Each library folder also contains 1-line summaries of the grouped subroutines, while each application folder contains READMEs adapted from the main program headers. NASA permission to upload actual software was granted on Jan. 24, 2014.


An I/O library for climate models, named CFIO (Climate Fast I/O). CFIO provides the same interface and features as PnetCDF, and adopts an I/O forwarding technique to provide automatic overlapping of I/O with computing. CFIO performs better than PnetCDF in terms of decreasing the overall running time of the program.

This requires MPI, PnetCDF and pthreads.


A software project that provides easy access to efficient and reliable geometric algorithms in the form of a C++ library. CGAL is used in various areas needing geometric computation, such as geographic information systems, computer aided design, molecular biology, medical imaging, computer graphics, and robotics.

The library offers data structures and algorithms like triangulations, Voronoi diagrams, Boolean operations on polygons and polyhedra, point set processing, arrangements of curves, surface and volume mesh generation, geometry processing, alpha shapes, convex hull algorithms, shape analysis, AABB and KD trees.


The CGAL Bindings project allows some packages of CGAL, the Computational Geometry Algorithms Library, to be used in languages other than C++, such as Java and Python. The bindings are implemented with SWIG.


An emerging parallel programming language whose design and development are being led by Cray Inc. in collaboration with academia, computing centers, and industry. Chapel’s goal is to make parallel programming more productive, from high-end supercomputers to commodity clusters and multicore desktops and laptops. Chapel is being developed in an open-source manner at SourceForge and is released under the BSD license.

Chapel supports a multithreaded execution model via high-level abstractions for data parallelism, task parallelism, concurrency, and nested parallelism. Chapel’s locale type enables users to specify and reason about the placement of data and tasks on a target architecture in order to tune for locality. Chapel supports global-view data aggregates with user-defined implementations, permitting operations on distributed data structures to be expressed in a natural manner. In contrast to many previous higher-level parallel languages, Chapel is designed around a multiresolution philosophy, permitting users to initially write very abstract code and then incrementally add more detail until they are as close to the machine as their needs require. Chapel supports code reuse and rapid prototyping via object-oriented design, type inference, and features for generic programming.

Chapel was designed from first principles rather than by extending an existing language. It is an imperative block-structured language, designed to be easy to learn for users of C, C++, Fortran, Java, Python, Matlab, and other popular languages. While Chapel builds on concepts and syntax from many previous languages, its parallel features are most directly influenced by ZPL, High-Performance Fortran (HPF), and the Cray MTA™/Cray XMT™ extensions to C and Fortran.


A high-performance language interoperability tool that generates Babel-compatible bindings for the Chapel programming language. For details on using the command-line tool, please consult the BRAID man page and the Babel user’s guide.


This provides interoperability with Chapel in three forms:

  • Chapel code inlined in Python

  • Chapel code from source-files

  • Chapel modules compiled into Python modules


Chebfun is an open-source software system for numerical computing with functions. The mathematical basis of Chebfun is piecewise polynomial interpolation implemented with what we call “Chebyshev technology”. Chebfun has extensive capabilities for dealing with linear and nonlinear differential and integral operators, and it also includes continuous analogues of linear algebra notions like QR and singular value decomposition. The Chebfun2 extension works with functions of two variables defined on a rectangle in the x-y plane.
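To make the idea of polynomial interpolation at Chebyshev points concrete, here is a minimal Python sketch. It is not Chebfun's code or API: the names `chebyshev_points` and `barycentric_interpolant` are hypothetical, and the sketch handles a single polynomial piece on [-1, 1] using the standard barycentric formula.

```python
import math


def chebyshev_points(n):
    """Return the n + 1 Chebyshev points of the second kind on [-1, 1]."""
    return [math.cos(math.pi * j / n) for j in range(n + 1)]


def barycentric_interpolant(f, n):
    """Build the degree-n polynomial interpolant of f at Chebyshev points.

    Hypothetical helper for illustration only. It uses the barycentric
    formula, whose weights for these points alternate in sign and are
    halved at the two endpoints.
    """
    xs = chebyshev_points(n)
    fs = [f(x) for x in xs]
    ws = [-1.0 if j % 2 else 1.0 for j in range(n + 1)]
    ws[0] *= 0.5
    ws[-1] *= 0.5

    def p(x):
        numerator = 0.0
        denominator = 0.0
        for xj, fj, wj in zip(xs, fs, ws):
            if x == xj:
                # Exactly on a node: return the sampled value directly.
                return fj
            term = wj / (x - xj)
            numerator += term * fj
            denominator += term
        return numerator / denominator

    return p


# Approximate exp on [-1, 1] with a degree-20 interpolant.
p = barycentric_interpolant(math.exp, 20)
error = abs(p(0.3) - math.exp(0.3))
```

For a smooth function such as `exp`, a few dozen points are typically enough to reach close to machine precision, which is why this kind of representation works well for computing with functions numerically.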


A Python implementation of Chebfun.


Interactive geometry software. Besides support for dynamic geometry, Cinderella.2 has many features that broaden the scope of the program to a wide variety of interaction scenarios. Compared to the old version of the program, two completely new parts were added: CindyLab, an environment for doing interactive physical experiments, and CindyScript, a high-level programming language that allows for fast, flexible and freely programmable interaction scenarios. Although each of the three parts of the program (geometry, physical simulation and scripting) can be used in a standalone manner, the program unleashes its full power when all three parts are used in combination. They are designed to interact very smoothly.


Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions. There are other reasons why a circular layout is advantageous, not the least being the fact that it is attractive.

Circos is ideal for creating publication-quality infographics and illustrations with a high data-to-ink ratio, richly layered data and pleasant symmetries. You have fine control over each element in the figure, so you can tailor its focus points and detail to your audience.


CKAN is a powerful data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available.

CKAN is built with Python on the backend and JavaScript on the frontend, and uses the Pylons web framework and SQLAlchemy as its ORM. Its database engine is PostgreSQL and its search is powered by Solr. It has a modular architecture that allows extensions to be developed to provide additional features such as harvesting or data upload.

CKAN uses its internal model to store metadata about the different records, and presents it on a web interface that allows users to browse and search this metadata. It also offers a powerful API that allows third-party applications and services to be built around it.


This extension contains plugins that add geospatial capabilities to CKAN.


ClimatePipes uses a web-based application platform because of its widespread support on mainstream operating systems, ease of use, and inherent collaboration support. The front-end of ClimatePipes uses HTML5 (WebGL, CSS3) to deliver state-of-the-art visualization and to provide a best-in-class user experience. The back-end of ClimatePipes is built using the Visualization Toolkit (VTK), Climate Data Analysis Tools (CDAT), and other climate and geospatial data processing tools such as GDAL and PROJ4.


CLFORTRAN is an open source (LGPL) Fortran module, designed to provide direct access to GPU, CPU and accelerator based computing resources available by the OpenCL standard.


clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU-optimized BLAS and LAPACK for AMD hardware, can be found in the AMD Accelerated Parallel Processing Math Libraries (APPML).


Clojure is a dynamic programming language that targets the Java Virtual Machine (and the CLR, and JavaScript). It is designed to be a general-purpose language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming. Clojure is a compiled language - it compiles directly to JVM bytecode, yet remains completely dynamic. Every feature supported by Clojure is supported at runtime. Clojure provides easy access to the Java frameworks, with optional type hints and type inference, to ensure that calls to Java can avoid reflection.

Clojure is a dialect of Lisp, and shares with Lisp the code-as-data philosophy and a powerful macro system. Clojure is predominantly a functional programming language, and features a rich set of immutable, persistent data structures. When mutable state is needed, Clojure offers a software transactional memory system and reactive Agent system that ensure clean, correct, multithreaded designs.


ClojureScript is a new compiler for Clojure that targets JavaScript. It is designed to emit JavaScript code which is compatible with the advanced compilation mode of the Google Closure optimizing compiler.


Immutant is an integrated suite of Clojure libraries. It represents an attempt to reduce the incidental complexity inherent in non-trivial applications. The services backed by the libraries include Undertow for web, HornetQ for messaging, Infinispan for caching, Narayana for transactions, and Quartz for scheduling.


Leiningen is the easiest way to use Clojure. With a focus on project automation and declarative configuration, it gets out of your way and lets you focus on your code.


Distributed, masterless, high performance, fault tolerant data processing for Clojure.


In one hand Quil holds Processing, a carefully crafted API for making drawing and animation extremely easy to get your biscuit-loving chops around. In the other she clutches Clojure, an interlocking suite of exquisite language abstractions forged by an army of hammocks and delicately wrapped in flowing silky parens of un-braided joy.

cluster management


Self-managed infrastructure, programmatic monitoring and orchestration. The circuit is a minimal distributed operating system that enables programmatic, reactive control over hosts, processes and connections within a compute cluster.


CLyther is a Python tool similar to Cython and PyPy. CLyther is a just-in-time specialization engine for OpenCL. The main entry points for CLyther are its clyther.task and clyther.kernel decorators. Once a function is decorated with one of these the function will be compiled to OpenCL when called.

CLyther is a Python language extension that makes writing OpenCL code as easy as Python itself. CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL.

CLyther exposes both the OpenCL C library and the OpenCL language to Python.


CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. CLUTO is well-suited for clustering data sets arising in many diverse application areas including information retrieval, customer purchasing transactions, web, GIS, science, and biology.

CLUTO’s distribution consists of both stand-alone programs and a library via which an application program can access directly the various clustering and analysis algorithms implemented in CLUTO.


gCLUTO is a cross-platform graphical application for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. gCLUTO is built on top of the CLUTO clustering library.


wCLUTO is a web-enabled data clustering application that is designed for the clustering and data-analysis requirements of gene-expression analysis. wCLUTO is also built on top of the CLUTO clustering library. Users can upload their datasets, select from a number of clustering methods, perform the analysis on the server, and visualize the final results.


Intel Concurrent Collections for C++ is a C++ template library for letting C++ programmers implement CnC applications which run in parallel on shared and distributed memory.

CnC makes it easy to write C++ programs which take full advantage of the available parallelism. Whether run on multicore systems, Xeon Phi™ or clusters, CnC will seamlessly exploit the performance potential of your hardware. Through its portability and composability (with itself and other tools), it provides future-proof scalability.


The CnC-Python system under development in the Habanero project at Rice University builds on past work on the Intel Concurrent Collections (CnC) and Habanero CnC projects.


COCO (COmparing Continuous Optimisers) is a platform for systematic and sound comparisons of real-parameter global optimisers. COCO provides benchmark function testbeds and tools for processing and visualizing data generated by one or several optimizers. The COCO platform has been used for the Black-Box-Optimization-Benchmarking (BBOB) workshops that took place during the GECCO conference in 2009, 2010, 2012, and 2013.


Code::Blocks is a free C, C++ and Fortran IDE built to meet the most demanding needs of its users. It is designed to be very extensible and fully configurable. An IDE with all the features you need, having a consistent look, feel and operation across platforms. Built around a plugin framework, Code::Blocks can be extended with plugins. Any kind of functionality can be added by installing/coding a plugin. For instance, compiling and debugging functionality is already provided by plugins.


Code_Saturne solves the Navier-Stokes equations for 2D, 2D-axisymmetric and 3D flows, steady or unsteady, laminar or turbulent, incompressible or weakly dilatable, isothermal or not, with scalars transport if required.

Several turbulence models are available, from Reynolds-Averaged models to Large-Eddy Simulation models. In addition, a number of specific physical models are also available as "modules": gas, coal and heavy-fuel oil combustion, semi-transparent radiative transfer, particle-tracking with Lagrangian modeling, Joule effect, electric arcs, weakly compressible flows, atmospheric flows, rotor/stator interaction for hydraulic machines.


The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be huge, executing efficient kernels is fundamental. Their optimization is, however, a challenging issue. Even though affine loop nests are generally present, the short trip counts and the complexity of mathematical expressions make it hard to determine a single or unique sequence of successful transformations. Therefore, we present the design and systematic evaluation of COFFEE, a domain-specific compiler for local assembly kernels. COFFEE manipulates abstract syntax trees generated from a high-level domain-specific language for PDEs by introducing domain-aware composable optimizations aimed at improving instruction-level parallelism, especially SIMD vectorization, and register locality. It then generates C code including vector intrinsics.


A Pythonic package for combinatorics. Combi lets you explore spaces of permutations and combinations as if they were Python sequences, but without generating all the permutations/combinations in advance. It lets you specify a lot of special conditions on these spaces. It also provides a few more classes that might be useful in combinatorics programming.
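The idea of treating a permutation space as a sequence without materializing it can be sketched in a few lines of Python. This is not Combi's API: the class name `LazyPermutations` is hypothetical, and the sketch only covers plain permutations, computing the item at a given index directly from the factorial number system.

```python
from math import factorial


class LazyPermutations:
    """Sequence-like view of all permutations of `items`.

    Hypothetical illustration, not Combi's implementation. Permutations
    appear in the same order as itertools.permutations, but each one is
    computed on demand from its index.
    """

    def __init__(self, items):
        self.items = tuple(items)
        self.size = factorial(len(self.items))

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if index < 0:
            index += self.size
        if not 0 <= index < self.size:
            raise IndexError("permutation index out of range")
        pool = list(self.items)
        result = []
        # Peel off one factorial-base digit per position.
        for remaining in range(len(pool), 0, -1):
            block = factorial(remaining - 1)
            position, index = divmod(index, block)
            result.append(pool.pop(position))
        return tuple(result)


# A space of 15! permutations, indexed without generating any of the others.
space = LazyPermutations(range(15))
last = space[len(space) - 1]
```

Because only the requested element is built, indexing costs time proportional to the square of the number of items rather than to the size of the space.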


COMCOT (Cornell Multi-grid Coupled Tsunami Model) is a tsunami modeling package capable of simulating the entire lifespan of a tsunami, from its generation through propagation to runup/rundown in coastal regions.

Waves can be generated via incident wave maker, fault model, landslide, or even customized profile. Flexible nested grid setup allows for the balance between accuracy and efficiency.

command-line utilities

9 Linux Commands To Be Known To Secure Your Linux From Danger -


This utility gets to the bottom of what makes files or directories different. It will recursively unpack archives of many kinds and transform various binary formats into more human readable form to compare them. It can compare two tarballs, ISO images, or PDF just as easily.


A command-line fuzzy finder written in Go.


Acts like a very simple and configurable file browser/navigator for the command line by taking advantage of the general-purpose fuzzy finder fzf. Although it comes without Miller columns, fzf-fs is inspired by tools like lscd and deer, which both follow the example set by ranger.


This is something the Unix toolkit always could have done, and arguably always should have done. It operates on key-value-pair data while the familiar Unix tools operate on integer-indexed fields: if the natural data structure for the latter is the array, then Miller’s natural data structure is the insertion-ordered hash map. This encompasses a variety of data formats, including but not limited to the familiar CSV. (Miller can handle positionally-indexed data as a special case.)
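The contrast between positionally indexed fields and key-value records can be sketched in Python, whose dictionaries preserve insertion order. This is an illustration of the data model only, not Miller's implementation or command syntax; the helpers `parse_kv_line`, `csv_to_records` and `cut`, and the sample data, are hypothetical.

```python
import csv
import io


def parse_kv_line(line, pair_sep=",", kv_sep="="):
    """Parse one 'key=value,key=value' line into an insertion-ordered dict."""
    record = {}
    for pair in line.strip().split(pair_sep):
        key, _, value = pair.partition(kv_sep)
        record[key] = value
    return record


def csv_to_records(text):
    """Read CSV text so that every row becomes a dict keyed by the header."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def cut(records, *keys):
    """Keep only the named fields, in the order the caller lists them."""
    return [{k: r[k] for k in keys if k in r} for r in records]


# Fields are addressed by name, so column position no longer matters.
table = "host,cpu,mem\nalpha,0.5,12\nbeta,0.9,7\n"
records = csv_to_records(table)
selected = cut(records, "mem", "host")
```

With positional tools the same selection would be written in terms of column numbers, which silently breaks when a column is added or reordered; addressing fields by key avoids that.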

compressive sampling

Compressive sampling is a signal processing technique for efficiently acquiring and reconstructing a signal by finding solutions to underdetermined linear systems. It is based on the principle that, through optimization, the sparsity of a signal can be exploited to recover it from far fewer samples than required by the Shannon-Nyquist sampling theorem. There are two conditions under which recovery is possible. The first is sparsity, which requires the signal to be sparse in some domain. The second is incoherence, which is applied through the restricted isometry property, a condition that is sufficient for recovering sparse signals.
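As a rough illustration of recovering a sparse signal from an underdetermined system, the sketch below applies iterative soft-thresholding (ISTA) to the l1-regularized least-squares objective. ISTA is one standard technique chosen here for brevity; it is not claimed to be the method used by any package in this list. The function names, problem sizes, regularization weight and random test matrix are all assumptions made for the example.

```python
import random


def matvec(A, x):
    """Compute A @ x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A.T @ r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft_threshold(value, threshold):
    """Shrink a value toward zero by `threshold` (proximal map of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1"""
    residual = [p - t for p, t in zip(matvec(A, x), y)]
    return 0.5 * sum(v * v for v in residual) + lam * sum(abs(v) for v in x)


def ista(A, y, lam=0.01, iterations=300):
    """Minimize the objective above by gradient steps plus soft-thresholding."""
    # The squared Frobenius norm bounds the largest eigenvalue of A.T @ A,
    # so 1 / lipschitz is a safe (if conservative) step size.
    lipschitz = sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        residual = [p - t for p, t in zip(matvec(A, x), y)]
        gradient = rmatvec(A, residual)
        x = [soft_threshold(xi - g / lipschitz, lam / lipschitz)
             for xi, g in zip(x, gradient)]
    return x


def make_problem(m=20, n=40, support=(3, 17, 31), seed=0):
    """Hypothetical test problem: 20 random measurements of a 3-sparse signal."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(n)] for _ in range(m)]
    x_true = [0.0] * n
    for j in support:
        x_true[j] = 1.0
    return A, matvec(A, x_true), x_true


A, y, x_true = make_problem()
x_hat = ista(A, y)
```

The conservative step size keeps the sketch short and guarantees that the objective never increases, at the cost of slow convergence; practical solvers use better step-size estimates and acceleration.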


A fast and robust first-order method that solves basis-pursuit problems and a large number of extensions (including TV denoising).


A set of Matlab templates, or building blocks, that can be used to construct efficient, customized solvers for a variety of convex models, including in particular those employed in sparse recovery applications.


ConicBundle is a callable library for C/C++ that implements a bundle method for minimizing the sum of convex functions that are given by first order oracles or arise from Lagrangean relaxation of particular conic linear programs.


A container (Linux Container) at its core is an allocation, portioning, and assignment of host (compute) resources such as CPU Shares, Network I/O, Bandwidth, Block I/O, and Memory (RAM) so that kernel level constructs may jail-off, isolate or “contain” these protected resources so that specific running services (processes) and namespaces may solely utilize them without interfering with the rest of the system. These processes could be lightweight Linux hosts based on a Linux image, multiple web servers and applications, a single subsystem like a database backend, or a single process such as ‘echo “Hello”’ with little to no overhead.

Commonly known as “operating system-level virtualization” or “OS Virtual Environments”, containers differ from hypervisor-level virtualization. The main difference is that the container model eliminates the hypervisor layer and the redundant OS kernels, binaries, and libraries typically needed to run workloads in a VM.


An open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Context Free Art

Context Free is a program that generates images from written instructions called a grammar. The program follows the instructions in a few seconds to create images that can contain millions of shapes. Chris Coyne created a small language for design grammars called CFDG. These grammars are sets of non-deterministic rules to produce images. The images are surprisingly beautiful, often from very simple grammars. Context Free is a full graphical environment for editing, rendering, and exploring CFDG design grammars.
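The way a non-deterministic design grammar turns a few rules into many shapes can be sketched in Python. This is not CFDG syntax and not how Context Free is implemented: the `RULES` table, the `render` function, and the tuple layout are hypothetical. Each rule name maps to weighted alternatives, one alternative is picked at random on every expansion, and recursion stops once shapes become too small to matter.

```python
import random

# Each alternative is (weight, body); each body entry is
# (child name, relative scale, x offset, y offset). Hypothetical format.
RULES = {
    "TREE": [
        (3.0, [("CIRCLE", 1.0, 0.0, 0.0), ("TREE", 0.8, 0.0, 1.0)]),  # grow upward
        (1.0, [("TREE", 0.6, -0.5, 0.5), ("TREE", 0.6, 0.5, 0.5)]),   # fork in two
    ],
}


def render(start, rng, min_scale=0.05, max_shapes=10000):
    """Expand `start` into a list of primitive shapes (name, x, y, scale)."""
    shapes = []
    stack = [(start, 1.0, 0.0, 0.0)]
    while stack and len(shapes) < max_shapes:
        name, scale, x, y = stack.pop()
        if scale < min_scale:
            continue  # too small to be visible: prune this branch
        if name not in RULES:
            shapes.append((name, x, y, scale))  # primitive shape: emit it
            continue
        weights = [weight for weight, _ in RULES[name]]
        bodies = [body for _, body in RULES[name]]
        body = rng.choices(bodies, weights=weights, k=1)[0]
        for child, child_scale, dx, dy in body:
            stack.append((child, scale * child_scale,
                          x + dx * scale, y + dy * scale))
    return shapes


# The same seed reproduces the same image; a different seed gives a variation.
shapes = render("TREE", random.Random(1))
```

Seeding the random generator makes each variation reproducible, so a pleasing result can be regenerated later from its seed alone.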

See also Structure Synth.


A Fortran 90 library that provides functions to manage grids and arbitrary sets of points, including interpolation and mapping between different coordinate systems.


An extended firmware platform that delivers a lightning fast and secure boot experience on modern computers and embedded systems. As an Open Source project it provides auditability and maximum control over technology.

Coreboot performs a little bit of hardware initialization and then executes additional boot logic, called a payload. With this separation of hardware initialization and later boot logic, coreboot can scale from specialized applications run directly from firmware, operating systems in flash, and custom bootloaders to implementations of firmware standards like PCBIOS and EFI without having to carry features not necessary in the target application, reducing the amount of code and flash space required.


Climate reconstruction software based on probability density function (PDF) methods. CREST implements a PDF-based method that takes into account the different climatic requirements of each species constituting the broader pollen type. PDFs are fitted in two successive steps, with parametric PDFs fitted first for each species and then a combination of those individual species PDFs into a broader single PDF to represent the pollen type as a unit. A climate value for the pollen assemblage is estimated from the likelihood function obtained after the multiplication of the pollen-type PDFs, with each being weighted according to its pollen percentage.


The toolbox contains MATLAB® routines for computing recurrence plots and related problems.


Cascading Style Sheets (CSS) is a style sheet language used for describing the look and formatting of a document written in a markup language. Although most often used to change the style of web pages and user interfaces written in HTML and XHTML, the language can be applied to any kind of XML document, including plain XML, SVG and XUL. Along with HTML and JavaScript, CSS is a cornerstone technology used by most websites to create visually engaging webpages, user interfaces for web applications, and user interfaces for many mobile applications.


Basscss is a lightweight collection of base element styles, immutable utilities, layout modules, and color styles designed for speed, clarity, performance, and scalability.


A free, open source compiler collection that combines multiple tools and techniques including MILEPOST GCC, ICI, CCC framework, cTuning web-services and Collective Optimization Database and cBench as the first practical step toward self-tuning, adaptive computing systems based on industrial tools, empirical techniques, transparent collective optimization, statistical analysis and machine learning. cTuning CC is a wrapper around any compiler such as GCC, LLVM, Open64, Path64, etc that can transparently invoke machine learning mode to correlate program features of a compiled program with the ones stored in the Collective Optimization Database and suggest better optimizations for multi-objective criteria such as improving execution time, compilation time, code size, etc (using optimization space frontier detection).


Cubica is a toolkit for efficient finite element simulations of deformable bodies containing both geometric and material non-linearities. Its main feature is its use of subspace methods, also known as dimensional model reduction or reduced order methods, which can accelerate simulations by several orders of magnitude.


CUDA (after the Plymouth Barracuda), which stands for Compute Unified Device Architecture, is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers direct access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.

Using CUDA, the GPUs can be used for general purpose processing (i.e., not exclusively graphics); this approach is known as GPGPU. Unlike CPUs, however, GPUs have a parallel throughput architecture that emphasizes executing many concurrent threads slowly, rather than executing a single thread very quickly.

negativo17 RPM Repository for NVIDIA Drivers -


CUB provides state-of-the-art, reusable software components for every layer of the CUDA programming model.


CUDA cffi bindings and helper classes for Python.


A way to program NVIDIA graphical processors, from Fortran-9X and eventually Matlab.


The FortCUDA project seeks to generate CUDA bindings in F95/2003 using the Fortran 2003 ISO_C_BINDING module. It is intended to give near-native call syntax to the CUDA SDK in Fortran 2003. Currently, most Fortran compilers support the ISO_C_BINDING module.

The FortCUDA project mostly consists of a very basic module that contains appropriate bindings for most of the CUDA function calls found in cuda.h and cuda_runtime.h. This is not necessarily complete, but is quite comprehensive. The file parsing capability is provided as part of the distribution and has been used to generate wrappers around functions from a few projects. Specific to CUDA, wrappers have been generated for cuda.h and cuda_runtime.h and are included in the FortCUDA library. In addition, Fortran modules have been generated for cublas.h and cufft.h, but are still being checked for accuracy. Theoretically the script can be used on any C header file and, provided our grammars are adequate, will produce a 90% solution to wrapping the enums, structs, and functions.


Gunrock is a CUDA library for graph primitives that refactors, integrates, and generalizes best-of-class GPU implementations of breadth-first search, connected components, and betweenness centrality into a unified code base useful for future development of high-performance GPU graph primitives.


CUDA C/C++ and the NVIDIA NVCC compiler toolchain support a number of features designed to make it easier to write portable code, including language integration of host and device code and data, declaration specifiers (e.g. __host__ and __device__) and preprocessor definitions (__CUDACC__). These features combine to enable developers to write code that can be compiled and run on either the host, the device, or both. Other compilers don’t recognize these features, however, so to really write portable code, we need preprocessor macros. This is where Hemi comes in.

Modern GPU

Modern GPU is code and commentary intended to promote new and productive ways of thinking about GPU computing. This project is a library, an algorithms book, a tutorial, and a best-practices guide.


The rCUDA framework is the most modern remote GPU virtualization solution today. It enables the concurrent remote usage of CUDA-enabled devices in a transparent way. Thus, the source code of applications does not need to be modified in order to use remote GPUs but rCUDA takes care of all the necessary details. Furthermore, the overhead introduced by using a remote GPU is very small.

rCUDA provides full compatibility support with CUDA. It implements all of the functions in the CUDA Runtime API and Driver API, excluding only those related to graphics interoperability. It additionally includes highly optimized TCP and low-level InfiniBand pipelined communications as well as full multi-thread and multi-node capabilities. rCUDA targets the same Linux OS distributions as CUDA does, and also supports x86 and ARM processor architectures. Furthermore, an integration of rCUDA with the SLURM scheduler has been developed, allowing your scheduled jobs to use remote GPUs. The combination of SLURM + rCUDA provides reductions in overall execution times of job batches between 25% and 45%, depending on the exact composition of the job batch.

It has been successfully tested with several applications including LAMMPS, WideLM, CUDASW++, HOOMD-blue, mCUDA-MEME, GPU-Blast, Gromacs, GAMESS, DL-POLY, and HPL.


NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks. It emphasizes performance, ease-of-use, and low memory overhead. NVIDIA cuDNN is designed to be integrated into higher-level machine learning frameworks, such as UC Berkeley’s popular Caffe software. The simple, drop-in design allows developers to focus on designing and implementing neural net models rather than tuning for performance, while still achieving the high performance modern parallel computing hardware affords.


A better microcontroller IDE.


D4M is a breakthrough in computer programming that combines the advantages of five distinct processing technologies (sparse linear algebra, associative arrays, fuzzy algebra, distributed arrays, and triple-store/NoSQL databases such as Hadoop HBase and Apache Accumulo) to provide a database and computation system that addresses the problems associated with Big Data. D4M significantly improves search, retrieval, and analysis for any business or service that relies on accessing and exploiting massive amounts of digital data. Evaluations have shown D4M to simultaneously increase computing performance and to decrease the effort required to build applications by as much as 100x. Improved performance translates into faster, more comprehensive services provided by companies involved in healthcare, Internet search, network security, and more. Less, and simplified, coding reduces development times and costs. Moreover, the D4M layered architecture provides a robust environment that is adaptable to various databases, data types, and platforms.


A toolkit providing a flexible, extensible interface between analysis codes and iterative systems analysis methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, stochastic expansion, and epistemic methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods.

Computational methods developed in structural mechanics, heat transfer, fluid mechanics, shock physics, and many other fields of engineering can be an enormous aid to understanding the complex physical systems they simulate. Often, it is desired to use these simulations as virtual prototypes to obtain an acceptable or optimized design for a particular system. Dakota seeks to enhance the utility of these computational methods by enabling their use as design tools, so that simulations may be used not just for single-point predictions, but also for automated determination of system performance improvements throughout the product life cycle.


Damaris is a middleware for I/O and data management targeting large-scale, MPI-based HPC simulations. It initially proposed to dedicate cores for asynchronous I/O in multicore nodes of recent HPC platforms, with an emphasis on ease of integration in existing simulations, efficient resource usage (with the use of shared memory) and simplicity of extension through plugins.

Over the years, Damaris has evolved into a more elaborate system, providing the possibility to use dedicated cores or dedicated nodes to data processing and I/O. It proposes a seamless connection to the VisIt software to enable in situ visualization with minimum impact on run time. Damaris provides an extremely simple API and can be easily integrated in existing large-scale simulations.


The goal of the Damsel project is to enable Exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models.


Dart is a cohesive, scalable platform for building apps that run on the web (where you can use Polymer) or on servers (such as with Google Cloud Platform). Use the Dart language, libraries, and tools to write anything from simple scripts to full-featured apps.


DART is a community facility for ensemble DA developed and maintained by the Data Assimilation Research Section (DAReS) at the National Center for Atmospheric Research (NCAR). DART provides modelers, observational scientists, and geophysicists with powerful, flexible DA tools that are easy to implement and use and can be customized to support efficient operational DA applications. DART is a software environment that makes it easy to explore a variety of data assimilation methods and observations with different numerical models and is designed to facilitate the combination of assimilation algorithms, models, and real (as well as synthetic) observations to allow increased understanding of all three. DART includes extensive documentation, a comprehensive tutorial, and a variety of models and observation sets that can be used to introduce new users or graduate students to ensemble DA. DART also provides a framework for developing, testing, and distributing advances in ensemble DA to a broad community of users by removing the implementation-specific peculiarities of one-off DA systems.

DART employs a modular programming approach to apply an Ensemble Kalman Filter which nudges the underlying models toward a state that is more consistent with information from a set of observations. Models may be swapped in and out, as can different algorithms in the Ensemble Kalman Filter. The method requires running multiple instances of a model to generate an ensemble of states. A forward operator appropriate for the type of observation being assimilated is applied to each of the states to generate the model’s estimate of the observation.
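The sequence described above (run an ensemble, apply a forward operator to each member, then nudge the states toward the observation) can be sketched for a single scalar state in Python. This is a simplified deterministic ensemble update written for illustration, not DART's code; the forward operator, the ensemble values, the observation and its error variance are all made-up assumptions.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance (n - 1 denominator) of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def forward_operator(state):
    """Hypothetical forward operator: the instrument reports twice the state."""
    return 2.0 * state


def assimilate(ensemble, observation, obs_error_var):
    """Update a scalar-state ensemble with one scalar observation."""
    # 1. Each member's estimate of what the instrument should have seen.
    predicted = [forward_operator(x) for x in ensemble]
    prior_mean = mean(predicted)
    prior_var = covariance(predicted, predicted)

    # 2. Kalman update in observation space: shift the mean toward the
    #    observation and shrink the spread deterministically.
    gain = prior_var / (prior_var + obs_error_var)
    post_mean = prior_mean + gain * (observation - prior_mean)
    shrink = (1.0 - gain) ** 0.5
    updated = [post_mean + shrink * (p - prior_mean) for p in predicted]

    # 3. Regress the observation-space increments back onto the state.
    slope = covariance(ensemble, predicted) / prior_var
    return [x + slope * (u - p) for x, u, p in zip(ensemble, updated, predicted)]


prior = [0.5, 0.8, 1.0, 1.2, 1.5]
posterior = assimilate(prior, observation=4.0, obs_error_var=0.5)
```

Here the observation of 4.0 corresponds to a state of 2.0, so the posterior ensemble mean moves from 1.0 toward 2.0 and the ensemble spread narrows, reflecting the information gained.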


DASH is a C++ Template Library for Distributed Data Structures with Support for Hierarchical Locality for HPC and Data-Driven Science.

Exascale systems are scheduled to become available in 2018-2020 and will be characterized by extreme scale and a multilevel hierarchical organization. Efficient and productive programming of these systems will be a challenge, especially in the context of data-intensive applications. Adopting the promising notion of Partitioned Global Address Space (PGAS) programming, the DASH project develops a data-structure oriented C++ template library that provides hierarchical PGAS-like abstractions for important data containers (multidimensional arrays, lists, hash tables, etc.) and allows a developer to control (and explicitly take advantage of) the hierarchical data layout of global data structures. In contrast to other PGAS approaches such as UPC, DASH does not propose a new language or require compiler support to realize global address space semantics. Instead, operator overloading and other advanced C++ features are used to provide the semantics of data residing in a global and hierarchically partitioned address space based on a runtime system with one-sided messaging primitives provided by MPI or GASNet. As such, DASH can co-exist with parallel programming models already in widespread use (like MPI) and developers can take advantage of DASH by incrementally replacing existing data structures with the implementation provided by DASH. Efficient I/O directly to and from the hierarchical structures and DASH-optimized algorithms such as map-reduce are also part of the project. Two applications from molecular dynamics and geoscience are driving the project and are adapted to use DASH in the course of the project.


Dask enables parallel computing through task scheduling and blocked algorithms. This allows developers to write complex parallel algorithms and execute them in parallel either on a modern multi-core machine or on a distributed cluster.

On a single machine dask increases the scale of comfortable data from fits-in-memory to fits-on-disk by intelligently streaming data from disk and by leveraging all the cores of a modern CPU.
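The idea of a blocked algorithm can be illustrated with the standard library alone. The sketch below is not Dask's API: the names `iter_blocks`, `block_partial` and `blocked_mean` are hypothetical, and a thread pool stands in for a real task scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def iter_blocks(values, block_size):
    """Yield consecutive blocks of at most block_size items."""
    block = []
    for value in values:
        block.append(value)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        yield block


def block_partial(block):
    """Per-block task: return the partial sum and the item count."""
    return sum(block), len(block)


def blocked_mean(values, block_size=1000, workers=4):
    """Compute a mean by combining independent per-block results."""
    # Executor.map submits every block up front, so this shows the
    # decomposition into tasks rather than true out-of-core streaming.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(block_partial, iter_blocks(values, block_size)))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count
```

For example, `blocked_mean(range(10), block_size=3)` splits the input into four blocks and returns 4.5. The point of the decomposition is that each block only needs to be in memory while its own task runs.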


Dat is an open source project that provides a streaming interface between every file format and data storage backend.


DataHub is a unified, managed, collaborative platform for making data-processing easy. Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system.

Data Transfer and Storage


A scalable data transfer management tool for the GridFTP transfer protocol. The goal is to reliably manage transfers of 1+ PB comprising millions of files.


BeStMan is a full implementation of SRM v2.2, developed by Lawrence Berkeley National Laboratory, for disk-based storage systems and mass storage systems such as HPSS. End users may have their own personal BeStMan that manages and provides an SRM interface to their local disks or storage systems. It works on top of existing disk-based Unix file systems, and has so far been reported to work on file systems such as NFS, PVFS, AFS, GFS, GPFS, PNFS, and Lustre. It also works with any existing file transfer service, such as gsiftp, http, https and ftp. It requires minimal administrative effort for deployment and maintenance.


CASTOR, which stands for the CERN Advanced STORage manager, is a hierarchical storage management (HSM) system developed at CERN and used to store physics production files and user files. Files can be stored, listed, retrieved and accessed in CASTOR using command line tools or applications built on top of the different data transfer protocols like RFIO (Remote File IO), ROOT libraries, GridFTP and XROOTD. CASTOR manages disk cache(s) and the data on tertiary storage or tapes. As of 2007 there were some 60 million files and about 7 petabytes of data in CASTOR.

CASTOR provides a UNIX-like directory hierarchy of file names. The directories are always rooted /castor/ (the will be different in other CASTOR sites). The CASTOR name space can be viewed and manipulated only through CASTOR client commands and library calls. OS commands like ls or mkdir will not work on CASTOR files. The CASTOR name space holds the permanent tape residence of the CASTOR files, while the more volatile disk residence is known only to the stager, which is the disk cache management component in CASTOR. When accessing or modifying a CASTOR file, one must therefore always use a stager.


The DaviX project aims to provide a solution for optimized remote I/O, data management and management of large collections of files over the WebDAV, Amazon S3 and HTTP protocols. DaviX is multi-platform, open source and written in C++.


A system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods. Depending on the Persistency Model, dCache provides methods for exchanging data with backend (tertiary) Storage Systems as well as space management, pool attraction, dataset replication, hot spot determination and recovery from disk or node failures. Connected to a tertiary storage system, the cache simulates unlimited direct access storage space.


The dCache dccp client.


DataMover-Lite (DML) is a simple file transfer tool with a graphical user interface which supports multi-protocol data movement. DML is available in both webstart and standalone versions. Currently, DML supports http, https, ftp, gridftp, lahfs and scp. For GridFTP, DML also supports directory browsing and transferring.


The Disk Pool Manager (DPM) is a lightweight storage solution for grid sites. It offers a simple way to create a disk-based grid storage element and supports relevant protocols (SRM, gridFTP, RFIO) for file management and access. It focuses on manageability (ease of installation and configuration, low maintenance effort), while providing all required functionality for a grid storage solution (support for multiple disk server nodes, different space types, multiple file replicas in disk pools).


GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. The GridFTP protocol is based on FTP, the highly-popular Internet file transfer protocol. We have selected a set of protocol features and extensions defined already in IETF RFCs and added a few additional features to meet requirements from current data grid projects.

GridFTP also addresses the problem of incompatibility between storage and access systems. Previously, each data provider would make their data available in their own specific way, providing a library of access functions. This made it difficult to obtain data from multiple sources, requiring a different access method for each, and thus dividing the total available data into partitions. GridFTP provides a uniform way of accessing the data, encompassing functions from all the different modes of access, building on and extending the universally accepted FTP standard. FTP was chosen as a basis for it because of its widespread use, and because it has a well defined architecture for extensions to the protocol (which may be dynamically discovered).


An open source utility that provides fast incremental file transfer.


For transferring large, deep file trees, rsync will pause while it generates lists of files to process. Since Version 3, it does this pretty fast, but on sluggish filesystems, it can take hours or even days before it will start to actually exchange rsync data.

Second, due to various bottlenecks, rsync will tend to use less than the available bandwidth on high speed networks. Starting multiple instances of rsync can improve this significantly. However, on such transfers, it is also easy to overload the available bandwidth, so it would be nice to both limit the bandwidth used if necessary and also to limit the load on the system.

Parsyncfp tries to satisfy all these conditions and more by:

  • using the fpart file partitioner which can produce lists of files very rapidly

  • allowing re-use of the cache files so generated

  • doing crude loadbalancing of the number of active rsyncs, suspending and unsuspending the processes as necessary

  • using rsync’s own bandwidth limiter (--bwlimit) to throttle the total bandwidth

  • making rsync’s own vast option selection available as a pass-through
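The partition-then-throttle approach in the list above can be sketched as follows. This is not parsyncfp's code: the helper names are hypothetical, the size-based chunking only imitates what a partitioner such as fpart produces, and dividing the total bandwidth evenly among the rsync processes is an assumption. The only rsync options used are `-a`, `--files-from` and `--bwlimit`.

```python
def partition_by_size(files, max_chunk_bytes):
    """Group (path, size) pairs into chunks of roughly max_chunk_bytes each."""
    chunks, current, current_bytes = [], [], 0
    for path, size in files:
        # Start a new chunk when adding this file would exceed the limit.
        if current and current_bytes + size > max_chunk_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append(path)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks


def rsync_commands(chunk_list_files, source, dest, total_bwlimit_kbps, n_parallel):
    """Build one rsync argument list per chunk file (commands are not executed)."""
    # Assumption: the total bandwidth budget is split evenly across processes.
    per_process = max(1, total_bwlimit_kbps // n_parallel)
    return [
        ["rsync", "-a", f"--bwlimit={per_process}", f"--files-from={list_file}", source, dest]
        for list_file in chunk_list_files
    ]
```

Each chunk would be written to its own list file and handed to a separate rsync process, so the file lists are produced once and can be reused on a later run.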


SRM-Lite is a simple command-line tool with pluggable file transfer protocol support. SRM-Lite supports scp and sftp in a high-performance way (hpn-ssh).


Unison is a file-synchronization tool for OSX, Unix, and Windows. It allows two replicas of a collection of files and directories to be stored on different hosts (or different disks on the same host), modified separately, and then brought up to date by propagating the changes in each replica to the other. Unison shares a number of features with tools such as configuration management packages (CVS, PRCS, Subversion, BitKeeper, etc.), distributed filesystems (Coda, etc.), uni-directional mirroring utilities (rsync, etc.), and other synchronizers (Intellisync, Reconcile, etc).


The XROOTD project aims at giving high performance, scalable fault tolerant access to data repositories of many kinds. The typical usage is to give access to file-based ones. It is based on a scalable architecture, a communication protocol, and a set of plugins and tools based on those. The freedom to configure it and to make it scale (for size and performance) allows the deployment of data access clusters of virtually any size, which can include sophisticated features, like authentication/authorization, integrations with other systems, WAN data distribution, etc.

The XRootD software framework is a fully generic suite for fast, low-latency and scalable data access, which can natively serve any kind of data organized as a hierarchical filesystem-like namespace based on the concept of a directory. As a general rule, particular emphasis has been put on the quality of the core software parts.


The DaviX project aims to provide a solution for optimized remote I/O, data management and management of large collections of files over the WebDAV, Amazon S3 and HTTP protocols. DaviX is multi-platform, open source and written in C++.

It is composed of two components:

  • libdavix: a C++ library. It offers an HTTP API, a remote I/O API and a POSIX compatibility layer.

  • davix-*: several utilities for file transfer, management of large collections of files and management of large files.

DaviX supports features like session reuse, redirection caching, vector operations, Metalink, X509 client certificate, proxy certificate, SOCKS4/5 or VOMS.

Dax Toolkit

The Dax Toolkit supports the fine-grained concurrency for data analysis and visualization algorithms required to drive exascale computing. The basic computational unit of the Dax Toolkit is a worklet, a function that implements the algorithm’s behavior on an element of a mesh (that is, a point, edge, face, or cell) or a small local neighborhood. The worklet is constrained to be serial and stateless; it can access only the element passed to and from the invocation. With this constraint, the serial worklet function can be concurrently executed on an unlimited number of threads without the complications of memory clashes or other race conditions.

The Dax Toolkit provides dispatchers that apply worklets to all elements in an input mesh, the results of which are collected into a resulting mesh. Although worklets are not allowed communication, many visualization algorithms require operations such as variable array packing and coincident topology resolution that intrinsically require significant coordination among threads. Dax enables such algorithms by classifying and implementing the most common and versatile communicative operations, which, when used in conjunction with the appropriate worklets, complete the visualization algorithms.
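The worklet and dispatcher roles can be illustrated with a small sketch. The Dax Toolkit itself is a C++ library; the Python names below (`cell_average_worklet`, `dispatch`) are hypothetical and only mirror the constraint that a worklet is serial, stateless, and sees a single element.

```python
from concurrent.futures import ThreadPoolExecutor


def cell_average_worklet(cell_point_values):
    """Serial, stateless worklet: reads only the element it was given."""
    return sum(cell_point_values) / len(cell_point_values)


def dispatch(worklet, elements, workers=4):
    """Apply a worklet to every element and collect results in input order."""
    # No worklet invocation shares state with another, so they can run
    # concurrently without locks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worklet, elements))


cells = [(1.0, 3.0), (2.0, 4.0, 6.0)]
averages = dispatch(cell_average_worklet, cells)  # [2.0, 4.0]
```

Operations that need coordination among threads, such as packing a variable-length result array, would live in the dispatcher layer rather than in the worklet.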


DCCRG is an easy to use grid for FVM/FEM simulations written in C++. It handles load balancing and neighbour cell data updates between processes automatically. MPI is used for parallelization.

The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes.


DGGRID is a public domain software program for creating and manipulating Discrete Global Grids. A Discrete Global Grid (DGG) consists of a set of regions that form a partition of the Earth’s surface, where each region has a single point contained in the region associated with it. Each region/point combination is called a cell. Depending on the application, data objects or values may be associated with the regions, points, or cells of a DGG. A Discrete Global Grid System (DGGS) is a series of discrete global grids, usually consisting of increasingly finer resolution grids (though the term DGG is often used interchangeably with the term DGGS).


We introduce the Declaratron, a system which takes a declarative approach to specifying mathematically based scientific computation. This uses displayable mathematical notation (Content MathML) and is both executable and semantically well defined. We combine domain specific representations of physical science (e.g. CML, Chemical Markup Language), MathML formulae and computational specifications (DeXML) to create executable documents which include scientific data and mathematical formulae. These documents preserve the provenance of the data used, and build tight semantic links between components of mathematical formulae and domain objects, in effect grounding the mathematical semantics in the scientific domain.


Noise reduction (Wikipedia) -

Video denoising (Wikipedia) -

A review of image denoising algorithms (paper, PDF, 2005, 41) - A. Buades et al. -

Is denoising dead? (paper, PDF, 2010, 17) - Priyam Chatterjee & Peyman Milanfar -


A denoising algorithm seeks to remove perturbations or errors from a signal. The last three decades have seen extensive research devoted to this arena, and as a result, today’s denoisers are highly optimized algorithms that effectively remove large amounts of additive white Gaussian noise. A compressive sensing (CS) reconstruction algorithm seeks to recover a structured signal acquired using a small number of randomized measurements. Typical CS reconstruction algorithms can be cast as iteratively estimating a signal from a perturbed observation. This paper answers a natural question: How can one effectively employ a generic denoiser in a CS reconstruction algorithm? In response, in this paper, we develop a denoising-based approximate message passing (D-AMP) algorithm that is capable of high-performance reconstruction. We demonstrate that, for an appropriate choice of denoiser, D-AMP offers state-of-the-art CS recovery performance for natural images. We explain the exceptional performance of D-AMP by analyzing some of its theoretical features. A critical insight in our approach is the use of an appropriate Onsager correction term in the D-AMP iterations, which coerces the signal perturbation at each iteration to be very close to the white Gaussian noise that denoisers are typically designed to remove.

This package contains the code to run the BM3D, BM3D-SAPCA, BLS-GSM, and NLM variants of the denoising-based approximate message passing and denoising-based iterative thresholding algorithms.
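For orientation, the D-AMP iteration with its Onsager correction term is commonly written as below. The notation is an assumption, since the abstract defines no symbols: y holds the m measurements, A is the measurement matrix, D_σ is the chosen denoiser at noise level σ, and div D is its divergence.

```latex
\begin{align}
  x^{t+1} &= D_{\hat{\sigma}^{t}}\!\left(x^{t} + A^{*} z^{t}\right), \\
  z^{t}   &= y - A x^{t}
             + \frac{z^{t-1}}{m}\,
               \operatorname{div} D_{\hat{\sigma}^{t-1}}\!\left(x^{t-1} + A^{*} z^{t-1}\right), \\
  \left(\hat{\sigma}^{t}\right)^{2} &= \frac{\lVert z^{t} \rVert_{2}^{2}}{m}.
\end{align}
```

The last term of the second line is the Onsager correction. Without it the recursion reduces to denoising-based iterative thresholding, the other family of algorithms included in the package.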


In this paper it is suggested that a stochastic isotropic diffusive process, representing a spatial first order auto regressive process (AR(1)-process), can be used as a null hypothesis for the spatial structure of climate variability. By comparing the leading empirical orthogonal functions (EOFs) of a fitted null hypothesis with EOF modes of an observed data set, inferences about the nature of the observed modes can be made. The concept and procedure of fitting the null hypothesis to the observed EOFs is in analogy to time analysis, where an AR(1)-process is fitted to the statistics of the time series in order to evaluate the nature of the time scale behavior of the time series. The formulation of a stochastic null hypothesis allows one to define teleconnection patterns as those modes that are most distinguished from the stochastic null hypothesis. The method is applied to several artificial and real data sets including the sea surface temperature of the tropical Pacific and Indian Ocean and the Northern Hemisphere wintertime and tropical sea level pressure.

A Matlab script for computing the Distinct EOFs is available.


DEPOT is a framework for easily storing and serving files in web applications on Python 2.6+ and Python 3.2+. Modern web applications need to rely on a huge amount of stored images, generated files and other data which are usually best kept outside of your database. DEPOT provides a simple and effective interface for storing your files on a storage backend of your choice (Local, S3, GridFS) and easily relating them to your application models (SQLAlchemy, Ming) like you would for plain data.


Delite is a research project from Stanford University’s Pervasive Parallelism Laboratory (PPL). Delite is a compiler framework and runtime for parallel embedded domain-specific languages (DSLs). Our goal is to enable the rapid construction of high performance, highly productive DSLs.

Delite is still in alpha, and there is no official release. However, the develop (Delite) and delite-develop (LMS) branches should be relatively stable for experimental development of new DSLs. For those interested in developing their own DSLs, we highly recommend using Forge, which is itself a DSL that automates much of the process of creating DSLs embedded in Scala. For those interested in using instead of building DSLs, alpha builds of OptiML, a DSL for machine learning, OptiQL, a DSL for data querying, and OptiGraph, a DSL for graph analytics, are currently available.


OptiML is an embedded domain-specific language for machine learning. OptiML is developed as a research project from Stanford University’s Pervasive Parallelism Laboratory (PPL).

OptiML is currently targeted at machine learning researchers and algorithm developers; it aims to provide a productive, high performance, MATLAB-like environment for linear algebra supplemented with machine learning specific abstractions. Our primary goal is to allow machine learning practitioners to write code in a highly declarative manner and still achieve high performance on a variety of underlying parallel, heterogeneous devices. The same OptiML program should run well and scale on a CMP (chip multi-processor), a GPU, a combination of CMPs and GPUs, clusters of CMPs and GPUs, and eventually even FPGAs and other specialized accelerators.

In particular, OptiML is designed to allow statistical inference algorithms expressible by the Statistical Query Model to be both easy to express and very fast to execute. These algorithms can be expressed in a summation form, and can be parallelized using fine-grained map-reduce operations. OptiML employs aggressive optimizations to reduce unnecessary memory allocations and fuse operations together to make these as fast as possible. OptiML also attempts to specialize implementations to particular hardware devices as much as possible to achieve the best performance.
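The summation form mentioned above can be made concrete with a small example. This is plain Python rather than OptiML syntax, and the function names are hypothetical: the gradient of a least-squares objective is a sum of independent per-example terms, so it maps over the examples and reduces by vector addition.

```python
from functools import reduce


def example_gradient(weights, features, target):
    """Gradient contribution of one example for 0.5 * (prediction - target)**2."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [error * x for x in features]


def add_vectors(a, b):
    """Associative combine step used by the reduce."""
    return [u + v for u, v in zip(a, b)]


def batch_gradient(weights, examples):
    """Map each example to its gradient term, then reduce by summation."""
    per_example = map(lambda ex: example_gradient(weights, ex[0], ex[1]), examples)
    return reduce(add_vectors, per_example)


examples = [([1.0, 2.0], 1.0), ([3.0, 4.0], 2.0)]
gradient = batch_gradient([0.0, 0.0], examples)  # [-7.0, -10.0]
```

Because the combine step is associative, the map can be split across devices and the partial sums merged in any grouping, which is what makes fine-grained map-reduce parallelization possible.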


A prototype meta DSL that generates Delite DSL implementations from a specification-like program.


Codes for detrending Kepler and other light curves. To study exoplanetary atmospheres, we typically require a 10^-4 to 10^-5 level of accuracy in flux. Achieving such precision has become the central challenge of exoplanetary research and is often impeded by systematic (non-Gaussian) noise from the instrument, stellar activity or both. Dedicated missions, such as Kepler, feature an a priori instrument calibration plan to the required accuracy but nonetheless remain limited by stellar systematics. More generic instruments often lack a sufficiently defined instrument response function, making them very hard to calibrate. The correct calibration strategy is hence of paramount importance and requires a dedicated effort and out-of-the-box thinking. In recent years, we have made significant advances in exoplanetary spectroscopy through improvements in data de-trending.

See PyKE.


DistAlgo is a very high-level language for programming distributed algorithms. This project implements a DistAlgo compiler with Python as the target language. In the following text, the name DistAlgo refers to the compiler and not the language.

Distributed Array Protocol

The Distributed Array Protocol (DAP) is a process-local protocol that allows two subscribers, called the “producer” and the “consumer” or the “exporter” and the “importer”, to communicate the essential data and metadata necessary to share a distributed-memory array between them. This allows two independently developed components to access, modify, and update a distributed array without copying. The protocol formalizes the metadata and buffers involved in the transfer, allowing several distributed array projects to collaborate, facilitating interoperability. By not copying the underlying array data, the protocol allows for efficient sharing of array data.


A tool for multidimensional variational analysis (divand) is presented. It allows the interpolation and analysis of observations on curvilinear orthogonal grids in an arbitrary high dimensional space by minimizing a cost function. This cost function penalizes the deviation from the observations, the deviation from a first guess and abruptly varying fields based on a given correlation length (potentially varying in space and time). Additional constraints can be added to this cost function such as an advection constraint which forces the analysed field to align with the ocean current. The method decouples naturally disconnected areas based on topography and topology.
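The cost function described above can be written schematically as follows. The symbols are assumptions introduced for illustration: φ is the analysed field, φ_b the first guess, d_j the observation at location x_j with weight μ_j, and J_c any additional constraint such as the advection constraint.

```latex
\begin{equation}
  J[\varphi] \;=\; \sum_{j=1}^{N} \mu_j \,\bigl[d_j - \varphi(x_j)\bigr]^2
  \;+\; \lVert \varphi - \varphi_b \rVert^2
  \;+\; J_c(\varphi)
\end{equation}
```

The first term penalizes the deviation from the observations. The norm in the second term penalizes both the deviation from the first guess and abrupt variations, with the relative weight of the derivative terms inside the norm set by the correlation length.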


D-LITe is a universal architecture for building simple applications over heterogeneous sensor networks.


Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.


A package for deploying and managing Docker containers. Project Atomic integrates the tools and patterns of container-based application and service deployment with trusted operating system platforms to deliver an end-to-end hosting architecture that’s modern, reliable, and secure.

An Atomic Host is a lean operating system designed to run Docker containers, built from upstream CentOS, Fedora, or Red Hat Enterprise Linux RPMs. It provides all the benefits of the upstream distribution, plus the ability to perform atomic upgrades and rollbacks.

Docker Machine

A tool that makes it really easy to go from “zero to Docker”. Machine creates Docker Engines on your computer, on cloud providers, and/or in your data center, and then configures the Docker client to securely talk to them.


A complete infrastructure platform for running Docker in production.


An innovative feature of DOpElib is to provide a software toolkit to solve forward PDE problems as well as optimal control problems constrained by PDE. DOpElib concentrates on a unified approach for both linear and nonlinear problems by interpreting every PDE problem as nonlinear and applying a Newton method to solve it. The focus is on the numerical solution of both stationary and nonstationary problems which come from different application fields, like elasticity and plasticity, fluid dynamics, and multiphysics problems such as fluid-structure interactions.
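Treating every problem as nonlinear costs little in the linear case, because Newton's method solves a linear residual in a single step. The scalar sketch below illustrates this. It is not DOpElib code (DOpElib is a C++ library), and the function `newton` and its parameters are assumptions made for the example.

```python
def newton(residual, jacobian, u0, tol=1e-12, max_iter=20):
    """Scalar Newton iteration; returns (solution, number of steps taken)."""
    u = u0
    for iteration in range(max_iter):
        r = residual(u)
        if abs(r) < tol:
            return u, iteration
        # Newton update: solve the linearized problem J * du = -r.
        u = u - r / jacobian(u)
    return u, max_iter


# Linear problem 3u - 6 = 0: exact after one Newton step.
linear_solution = newton(lambda u: 3.0 * u - 6.0, lambda u: 3.0, 10.0)  # (2.0, 1)

# Nonlinear problem u^2 - 4 = 0: same solver, more steps.
nonlinear_solution = newton(lambda u: u * u - 4.0, lambda u: 2.0 * u, 10.0)
```

The same loop handles both cases, which is the appeal of a unified approach: the linear problem is simply the special case that converges immediately.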


DSCPACK can be used to solve sparse linear systems using direct methods on multiprocessors and networks-of-workstations. This package is suitable for systems where the coefficient matrix is symmetric and sparse. This solver is written in C; it uses MPI for inter-processor communication and the BLAS library for improved cache-performance.


A software experimental platform, providing the foundations needed to develop dedicated modular applications aggregating functionalities embedded using low-level and interchangeable software entities - plugins - and orchestrated through high-level software entities - scripts or GUIs.


Platform for multi-physics simulations built on top of dtk.


DxTer is a system for researching Design by Transformation (DxT). DxTer takes as input a knowledge base of software design transformations and a graph representing functionality to be implemented. It generates a search space of implementations from the transformations, estimates their costs, and outputs the best code.

DxT can be used as a principled way to re-engineer legacy applications or to forward engineer new applications. (The former is used for domains containing a single application, whereas the latter is used when transformations describe a combinatorial number of derivable applications). Our primary application is to forward engineer dense linear algebra applications using the Flame methodology and Elemental library. Our goal is to express the design knowledge of Elemental as transformations, and to generate Elemental libraries for new architectures, rather than hand-deriving such libraries. Doing so will be a significant accomplishment — both in software engineering in general, and dense linear algebra in particular.


DZSlides is a one-page-template to build your presentation in HTML5 and CSS3.

Earth Orbit

An astronomically precise and accurate model that offers 3-D visualizations of Earth’s orbital geometry, Milankovitch parameters and the ensuing insolation forcing. The model is developed in MATLAB® as a user-friendly graphical user interface. Users are presented with a choice between the Berger (1978a) and Laskar et al. (2004) astronomical solutions for eccentricity, obliquity and precession. A "demo" mode is also available, which allows the Milankovitch parameters to be varied independently of each other, so that users can isolate the effects of each parameter on orbital geometry, the seasons, and insolation. A 3-D orbital configuration plot, as well as various surface and line plots of insolation and insolation anomalies on various time and space scales are produced. Insolation computations use the model’s own orbital geometry with no additional a priori input other than the Milankovitch parameter solutions.
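The link between orbital geometry and insolation can be sketched with the standard formula for daily mean top-of-atmosphere insolation. This is written in Python rather than MATLAB and is not taken from the model itself; the function name, the default solar constant of 1361 W/m², and the choice to pass the solar declination and the Earth-Sun distance ratio directly are all assumptions for illustration. In a full model the declination would follow from obliquity and time of year, and the distance ratio from eccentricity and precession.

```python
import math


def daily_mean_insolation(latitude_deg, declination_deg,
                          distance_ratio=1.0, solar_constant=1361.0):
    """Daily mean top-of-atmosphere insolation in W/m^2.

    distance_ratio is r/a, the Earth-Sun distance over the semi-major axis.
    """
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg)
    # Hour angle at sunset: cos(h0) = -tan(phi) * tan(delta).
    # Clamping covers polar night (h0 = 0) and polar day (h0 = pi).
    cos_h0 = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    h0 = math.acos(cos_h0)
    return (solar_constant / math.pi) * distance_ratio ** -2 * (
        h0 * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(h0)
    )


equator_equinox = daily_mean_insolation(0.0, 0.0)   # solar_constant / pi
polar_night = daily_mean_insolation(80.0, -23.44)   # 0.0
```

Varying one input at a time, in the spirit of the demo mode, shows how each parameter acts: the declination redistributes insolation across latitudes and seasons, while the distance ratio scales it everywhere.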


EAVL is the Extreme-scale Analysis and Visualization Library.


An integrated development environment (IDE). It contains a base workspace and an extensible plug-in system for customizing the environment. Written mostly in Java, Eclipse can be used to develop Java applications. By means of various plug-ins, Eclipse may also be used to develop applications in other programming languages: Ada, ABAP, C, C++, COBOL, Fortran, Haskell, JavaScript, Lasso, Lua, Natural, Perl, PHP, Prolog, Python, R, Ruby (including Ruby on Rails framework), Scala, Clojure, Groovy, Scheme, and Erlang. It can also be used to develop packages for the software Mathematica. Development environments include the Eclipse Java development tools (JDT) for Java and Scala, Eclipse CDT for C/C++ and Eclipse PDT for PHP, among others.


Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.


A new tool for recording and replaying energy harvesting conditions. Energy harvesting is a necessity for many small, embedded sensing devices that must operate maintenance-free for long periods of time. However, understanding how the environment changes and its effects on device behavior has always been a source of frustration. Ekho allows system designers working with ultra low power devices to realistically predict how new hardware and software configurations will perform before deployment. By taking advantage of electrical characteristics all energy sources share, Ekho is able to emulate many different energy sources (e.g., solar, RF, thermal, and vibrational) and takes much of the guesswork out of experimentation with tiny, energy harvesting sensing systems.


ELCIRC is an unstructured-grid model designed for the effective simulation of 3D baroclinic circulation across river-to-ocean scales. It uses a finite-volume/finite-difference Eulerian-Lagrangian algorithm to solve the shallow water equations, written to realistically address a wide range of physical processes and of atmospheric, ocean and river forcings. The numerical algorithm is low-order, but volume conservative, stable and computationally efficient. It also naturally incorporates wetting and drying of tidal flats. While originally developed to meet specific modeling challenges for the Columbia River, ELCIRC has been extensively tested against standard ocean/coastal benchmarks, and is starting to be applied to estuaries and continental shelves around the world.


A functional reactive language for interactive applications. Elm is great for 2D and 3D games, diagrams, widgets, and websites.


The eLesson Markup Language (eLML) is an open source XML framework for creating structured eLessons. For easier lesson authoring we offer the web-based WYSIWYG Firedocs eLML Editor, and to create eLML template layouts without any XSLT knowledge you can use our new Template Builder. Once you have created your eLML lesson you can transform it into many different output formats such as IMS Content Package or SCORM, various HTML templates, eBooks (ePub format), PDF, Office documents (ODF) and many more listed under "Output Formats".

El Topo

El Topo is a public domain C++ package for tracking dynamic surfaces represented as triangle meshes in 3D. It robustly handles topology changes such as merging and pinching off, while adaptively maintaining a tangle-free, high-quality triangulation.

The current release contains source for the El Topo library, as well as Talpa, an executable demonstrating several applications of our method. The code has been tested on OS X and Linux and is freely available for download.


Embree is a collection of high-performance ray tracing kernels, developed at Intel. The target users of Embree are graphics application engineers who want to improve the performance of their applications by leveraging the optimized ray tracing kernels of Embree. The kernels are optimized for photo-realistic rendering on the latest Intel® processors with support for SSE, AVX, AVX2, and the 16-wide Intel® Xeon Phi™ coprocessor vector instructions. Embree supports runtime code selection to choose the traversal and build algorithms that best match the instruction set of your CPU. We recommend using Embree through its API to get the highest benefit from future improvements. Embree is released as Open Source under the Apache 2.0 license.


EMPIRE is the name given to a way of changing the source code of a dynamical model so that it can interface with sequential data assimilation methods.

EMPIRE should be one of the quickest and easiest ways in which to modify the source code of the model to use data assimilation.


Emscripten is an LLVM-based project that compiles C and C++ into highly-optimizable JavaScript in asm.js format. This lets you run C and C++ on the web at near-native speed, without plugins.



A lightweight multi-platform, multi-architecture CPU emulator framework. The features include:

  • Multi-architecture: Arm, Arm64 (Armv8), M68K, Mips, PowerPC, Sparc, & X86 (including X86_64);

  • Clean/simple/lightweight/intuitive architecture-neutral API;

  • Support fine-grained instrumentation at various levels;

  • Implemented in pure C language, with bindings for Python available;

  • Native support for Windows & *nix (with Mac OSX, iOS, Android, Linux, *BSD & Solaris confirmed);

  • Thread-safe by design.


Equelle is a domain-specific language for the specification of simulators for systems of PDEs through a high-level syntax. The language allows the user to focus on equations and numerics while hiding the low-level details of software and hardware implementations.


The Federation of Earth Science Information Partners (ESIP) is a broad-based, distributed community of data and information technology practitioners.



The Common Information Model (CIM) is a metadata standard used by the climate research community and others to describe the artifacts and processes they work with. This includes climate simulations, the specific model components used to run those simulations, the datasets generated by those components, the geographic grids upon which those components and data are mapped, the computing platforms used, and so on.


Earth System CoG is a web environment that enables users to create project workspaces, connect projects into networks, share and consolidate information within those networks, and seamlessly link to tools for data archival, reformatting and search, data visualization, and metadata collection and display. CoG is integrated with the Earth System Grid Federation (ESGF) data distribution software and provides an easy to use interface to its services.


Cupid is a development and training environment for models that use the Earth System Modeling Framework (ESMF) and National Unified Operational Capability (NUOPC) Layer infrastructure. Cupid is implemented as a plug-in for the widely used Eclipse Integrated Development Environment (IDE). Together, Cupid and Eclipse form an accessible, appealing training environment that makes it easier and faster to build NUOPC-based applications.


ES-DOC is an international effort to develop tools to describe Earth system models in order to better understand and utilize model data. The tools are based on the Common Information Model (CIM) standard.


The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) enterprise system is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data.

ESMF Web Services

The option to implement a variety of models as web services was implemented in the Earth System Modeling Framework (ESMF). ESMF is based on the idea of components, which may represent physical domains such as the atmosphere, ocean, or cryosphere, or specific processes such as ocean biogeochemistry. These components have a standard interface that includes a specification of input fields, output fields, and time information. When running on high performance computing systems, ESMF components are usually called as subroutines of a main program. With ESMF web services, the components can be run on multiple computer systems, and can communicate with each other through web protocols.

ESMF web services are currently comprised of a set of SOAP (Simple Object Access Protocol) interfaces implemented using a combination of Apache Tomcat, Axis2, and custom Java classes. The SOAP services provide the gateway between the ESMF components and the Internet.


ESMPy is a Python interface to the Earth System Modeling Framework (ESMF) regridding utility.

ESMF is software for building and coupling weather, climate, and related models. It has a robust, parallel and scalable remapping package, used to generate remapping weights. It can handle a wide variety of grids and options: logically rectangular grids and unstructured meshes; regional or global grids; 2D or 3D; and pole and masking options. ESMF also has capabilities to read grid information from NetCDF files in a variety of formats, including the evolving Climate and Forecast (CF) GridSpec and UGRID conventions. It is currently being merged with the OpenClimateGIS package so that it can also support Geographic Information System (GIS) data formats.

ESMPy supports a single-tile logically rectangular discretization type called Grid and an unstructured discretization type called Mesh (ESMF also supports observational data streams). ESMPy supports bilinear, finite element patch recovery and first-order conservative interpolation methods. There is also an option to ignore unmapped destination points and mask out points on either the source or destination. Regridding on the sphere takes place in 3D Cartesian space, so the pole problem is not an issue as it can be with other Earth system grid remapping software. Grid and Mesh objects can be created in 2D or 3D space, and 3D first-order conservative regridding is fully supported. Future plans for ESMPy involve the incorporation of observational data streams and time operations, in addition to the GIS formats mentioned previously.


The National Unified Operational Prediction Capability (NUOPC) is a consortium of Navy, NOAA, and Air Force modelers and their research partners. It aims to advance the weather prediction modeling systems used by meteorologists, mission planners, and decision makers. NUOPC partners are working toward a common model architecture - a standard way of building models - in order to make it easier to collaboratively build modeling systems. To this end, they have developed a NUOPC Layer that defines conventions and templates for using the Earth System Modeling Framework (ESMF).


OpenClimateGIS (OCGIS) is a Python package designed for geospatial manipulation, subsetting, computation, and translation of climate datasets stored in local NetCDF files or files served through THREDDS data servers. OpenClimateGIS has a straightforward, request-based API that is simple to use yet complex enough to perform a variety of computational tasks. The software is built entirely from open source packages. ClimateTranslator is a new web interface to the OpenClimateGIS functionality. OpenClimateGIS is currently being merged with high performance parallel grid remapping capabilities from ESMF, through the ESMPy package.


The central goal of ExaStencils is to develop a radically new software technology for applications with exascale performance. To reach this goal, the project focusses on a comparatively narrow but very important application domain. The aim is to enable a simple and convenient formulation of problem solutions in this domain. The software technology developed in ExaStencils shall facilitate the highly automatic generation of a large variety of efficient implementations via the judicious use of domain-specific knowledge in each of a sequence of optimization steps such that, at the end, exascale performance results.

The application domain chosen is that of stencil codes, i.e., compute-intensive algorithms in which data points in a grid are redefined repeatedly as a combination of the values of neighboring points. The neighborhood pattern used is called a stencil. Stencil codes are used for the solution of discrete partial differential equations and the resulting linear systems.
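
To make the idea of a stencil concrete, here is a minimal sketch in plain Python of a five-point averaging stencil (a Jacobi sweep for the discrete Laplace equation). It is not ExaStencils output; the function names `jacobi_step` and `relax` are illustrative assumptions.

```python
def jacobi_step(grid):
    """Return a new grid in which every interior point is replaced by
    the average of its four direct neighbours (a five-point stencil)."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def relax(grid, sweeps):
    """Apply the stencil repeatedly, as a stencil code would."""
    for _ in range(sweeps):
        grid = jacobi_step(grid)
    return grid
```

Each sweep reads only from the old grid and writes to a fresh one, which is what makes this kind of loop a natural target for automatic parallelisation and tiling.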


An EXpression Capturing Finite Element Library is a library developed during my PhD as a means to explore the benefits of using active library techniques for the performance optimisation of finite-element simulations. In particular active library techniques facilitate efficient implementations of domain specific languages.

Excafé only supports triangular meshes with Lagrange basis functions at present. Furthermore boundary integrals have not yet been implemented. However, the functionality present is more than sufficient to implement an incompressible Navier-Stokes solver, which is included in the distribution.

One topic that Excafé has been used to explore is the symbolic analysis of the expressions in finite element local assembly matrices. Excafé has access to run-time representations of variational forms and basis functions. It uses this to build symbolic representations of each entry of the local assembly matrix. Once it has these, it uses a common sub-expression elimination pass targeted at polynomial evaluation to find an evaluation strategy for these expressions that minimizes operation count.


EZFIO is the Easy Fortran I/O library generator. It automatically generates an I/O library from a simple configuration file. The produced library contains Fortran subroutines to read/write the data from/to disk, and to check if the data exists. A Python and an OCaml API are also provided.

With EZFIO, the data is organized in a file system inside a main directory. This main directory contains subdirectories, which contain files. Each file corresponds to one data item. For atomic data the file is a plain text file, and for array data the file is a gzipped text file.


The EvoGrid is a worldwide, cross-disciplinary effort to create an abstract, yet plausible simulation of the chemical origins of life on Earth. One could think of this as an artificial origin of life experiment. Our strategy is to employ a large number of computers in a grid to simulate a digital primordial soup along with a distributed set of computers acting as observers looking into that grid. These observers, modeled after the very successful @Home scientific computation projects, will be looking for signs of emergent complexity and reporting back to the central grid.


A library of math functions targeted at 32-bit and 64-bit x86 Linux systems. The purpose of this library is to provide faster drop-in replacements for selected functions of the standard math library libm. These functions are written so they can be more optimized by compilers, and all special case tests for increased consistency and accuracy have been removed. They are based on the corresponding implementations from the Cephes math library by Stephen L. Moshier. The code has been simplified by using internal compiler facilities wherever possible and assuming little-endian IEEE-754 single and double precision math.


A C++ parallel programming framework advocating high-level, pattern-based parallel programming. It chiefly supports streaming and data parallelism, targeting heterogeneous platforms composed of clusters of shared-memory platforms, possibly equipped with computing accelerators such as NVIDIA GPGPUs, Xeon Phi, Tilera TILE64.

FastFlow comes as a C++ template library designed as a stack of layers that progressively abstracts out the programming of parallel applications. The goal of the stack is threefold: portability, extensibility, and performance. For this, all the three layers are realised as thin strata of C++ templates that are 1) seamlessly portable; 2) easily extended via subclassing; and 3) statically compiled and cross-optimised with the application. The terse design ensures easy portability on almost all OSes and CPUs with a C++ compiler.


The FASTMath SciDAC Institute develops and deploys scalable mathematical algorithms and software tools for reliable simulation of complex physical phenomena and collaborates with application scientists to ensure the usefulness and applicability of FASTMath technologies.


FAUST (Functional Audio Stream) is a functional programming language specifically designed for real-time signal processing and synthesis. FAUST targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards.


This project provides an LV2 plugin architecture for the Faust programming language. The package contains the Faust architecture and templates for the needed LV2 manifest (ttl) files, a collection of sample plugins written in Faust, and a generic GNU Makefile for compiling the plugins.


A virtual guitar amplifier for Linux running with jack (Jack Audio Connection Kit). It takes the signal from your guitar as any real amp would do: as a mono-signal from your sound card. Your tone is processed by a main amp and a rack-section. Both can be routed separately and deliver a processed stereo-signal via Jack. You may fill the rack with effects from more than 25 built-in modules spanning from a simple noise-gate to brain-slashing modulation-fx like flanger, phaser or auto-wah. Your signal is processed with minimum latency. On any properly set-up Linux system, you need not wait more than 10 milliseconds for your playing to be processed and delivered by guitarix. It offers the range of sounds you would expect from a full-featured universal guitar-amp. A great part of guitarix effects is written in Faust.


FBReader is a free (and ad-free) multi-platform ebook reader. It provides access to popular network libraries that contain a large set of ebooks. Download books for free or for a fee. Add your own catalog. It supports popular ebook formats: ePub, fb2, mobi, rtf, html, plain text, and a lot of other formats. It is highly customizable. Choose colors, fonts, page turning animations, dictionaries, bookmarks, etc. to make reading as convenient as you want.


The FEAST solver package is a free high-performance numerical library for solving the standard or generalized eigenvalue problem, and obtaining all the eigenvalues and eigenvectors within a given search interval. It is based on an innovative fast and stable numerical algorithm — named the FEAST algorithm — which deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms) or other Davidson-Jacobi techniques. The FEAST algorithm takes its inspiration from the density-matrix representation and contour integration technique in quantum mechanics. It is free from explicit orthogonalization procedures, and its main computational tasks consist of solving very few inner independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of magnitude smaller than the original one. The FEAST algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel architectures.

This general purpose FEAST solver package includes both reverse communication interfaces and ready-to-use predefined interfaces for dense, banded and sparse systems. It includes double and single precision arithmetic, and all the interfaces are compatible with Fortran (77, 90) and C. FEAST is both a comprehensive library package and easy-to-use software. This solver is expected to significantly augment numerical performance and capabilities in large-scale modern applications.

Fedora Playground Repository

The Playground repository gives contributors a place to host packages that are not up to the standards of the main Fedora repository but may still be useful to other users. For now the Playground repository contains both packages that are destined for eventual inclusion into the main Fedora repository and packages that are never going to make it there. Users of the repository should be willing to endure a certain amount of instability when using packages from there.


Feldspar (Functional Embedded Language for DSP and PARallelism) is a domain-specific language with associated code generator mainly targeting digital signal processing algorithms. The project has developed a prototype compiler that generates ISO C99 code for programs written in this high-level language, and plans to target real digital signal processing hardware in the future.


The Finite-Element Sea Ice Model (FESIM) includes the elastic-viscous-plastic (EVP) and viscous-plastic (VP) solvers and employs a flux corrected transport algorithm to advect the ice and snow mean thicknesses and concentration. The model is formulated on unstructured triangular meshes. It assumes a collocated placement of ice velocities, mean thicknesses and concentration at mesh vertices, and relies on piecewise-linear (P1) continuous elements.


FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. It supports everything from the most obscure ancient formats up to the cutting edge, no matter if they were designed by some standards committee, the community or a corporation.

It contains libavcodec, libavutil, libavformat, libavfilter, libavdevice, libswscale and libswresample, which can be used by applications, as well as ffmpeg, ffserver, ffplay and ffprobe, which can be used by end users for transcoding, streaming and playing.



A library that puts various Fast Fourier Transform implementations (built-in and third party) under a single interface. The supported backends are KissFFT, Ooura FFT, libavcodec, FFTW, Intel IPP, Intel MKL, NVIDIA cuFFT, AMD OpenCL and ViennaCL.


A C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST).


Library fourpack provides a convenient and uniform interface to the Fast Fourier Transform implemented in the Intel Math Kernel Library and FFTW. It contains routines that:

  • implement an interface to really fast FFT libraries, FFTW and Intel Math Kernel Library;

  • perform linear filtration with the use of FFT;

  • compute direct and inverse spherical function transform; and

  • compute tuning parameters for FFTW.

This requires petools.


The NFFT (nonequispaced fast Fourier transform or nonuniform fast Fourier transform) is a C subroutine library for computing the nonequispaced discrete Fourier transform (NDFT) and its generalisations in one or more dimensions, of arbitrary input size, and of complex data.


A Python interface for NFFT.


When the data is irregular in either the "physical" or "frequency" domain, unfortunately, the FFT does not apply. Over the last twenty years, a number of algorithms have been developed to overcome this limitation - generally referred to as non-uniform FFTs (NUFFT), non-equispaced FFTs (NFFT) or unequally-spaced FFTs (USFFT). They achieve the same O(N log N) computational complexity, but with a larger, precision-dependent, and dimension-dependent constant.

We have developed some NUFFT libraries in Fortran 77 and Fortran 90 that are freely available under the GPL license.
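
For orientation, the quantity that such libraries approximate can be written down directly. The sketch below evaluates a Fourier sum at arbitrary (non-equispaced) nodes by brute force in O(N·M) time. The sign convention, the frequency range -N/2..N/2-1, and the name `ndft` are assumptions chosen for illustration; individual libraries differ on these details, and this is not the API of any of them.

```python
import cmath


def ndft(coeffs, nodes):
    """Directly evaluate sum_k coeffs[k] * exp(-2*pi*i*k*x) at each x in
    nodes, with integer frequencies k running from -N/2 to N/2 - 1."""
    n = len(coeffs)
    freqs = range(-(n // 2), n - n // 2)
    return [
        sum(c * cmath.exp(-2j * cmath.pi * k * x)
            for c, k in zip(coeffs, freqs))
        for x in nodes
    ]
```

A fast non-uniform transform returns the same values up to a chosen precision but replaces the double loop with a gridding (spreading) step followed by an ordinary FFT.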


OpenFFT is an open source parallel package for computing three-dimensional Fast Fourier Transforms (3-D FFTs) of both real and complex numbers of arbitrary input size. It originates from OpenMX (Open source package for Material eXplorer). OpenFFT adopts a communication-optimal domain decomposition method that is adaptive and capable of localizing data when transposing from one dimension to another for reducing the total volume of communication. It is written in C and MPI, with support for Fortran through the Fortran interface, and employs FFTW3 for computing 1-D FFTs.

Ooura FFT

A package to calculate Discrete Fourier/Cosine/Sine Transforms of 1-dimensional sequences of length 2^N. This package contains C and Fortran FFT codes.
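
The restriction to lengths of 2^N is characteristic of radix-2 algorithms, which split a transform into two half-length transforms at every level. The following is a textbook recursive Cooley-Tukey sketch in Python, shown only to illustrate why power-of-two lengths are convenient; it is not the package's own code, and the name `fft_radix2` is an assumption.

```python
import cmath


def fft_radix2(x):
    """Recursive decimation-in-time FFT for sequences of length 2^N."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Production libraries use iterative, in-place variants with precomputed twiddle tables, but the halving structure is the same.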


Parallel Three-Dimensional Fast Fourier Transforms is a library for large-scale computer simulations on parallel platforms. 3D FFT is an important algorithm for simulations in a wide range of fields, including studies of turbulence, climatology, astrophysics and material science.


A parallel FFT software library based on MPI.


A parallel software library for the calculation of three-dimensional nonequispaced FFTs. It is available under the GPL licence. The parallelization is based on MPI. PNFFT depends on the PFFT and FFTW software libraries.


The Sparse Fast Fourier Transform is a recent algorithm developed by Hassanieh et al. [2, 3] for computing the discrete Fourier Transform of signals with an (exactly or approximately) sparse frequency domain. The algorithm improves the asymptotic runtime compared to the prior methods based on pruning.


Feel++ is a C++ library for solving partial differential equations using generalized Galerkin methods such as the finite element method, the h/p finite element method, the spectral element method or the reduced basis method.


FiberViewerLight is an open-source C++ application to analyze fiber bundles. FiberViewerLight is now available as a 3D Slicer extension.


A library for fast computation of Gauss transforms in multiple dimensions, using the Improved Fast Gauss Transform and Approximate Nearest Neighbor searching. This software allows for efficient computation of probabilities by Kernel Density Estimation (KDE), and can reduce complexity of algorithms commonly used in Computer Vision, Machine Learning, etc, that must evaluate the Gauss transform.
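
As a point of reference, the discrete Gauss transform that such a library accelerates can be evaluated directly as G(y_j) = sum_i q_i exp(-|y_j - x_i|^2 / h^2). The plain-Python sketch below does exactly that at O(N·M) cost; the bandwidth convention (division by h squared) and the function name `gauss_transform` are assumptions for illustration and are not taken from the library.

```python
import math


def gauss_transform(sources, weights, targets, h):
    """Direct evaluation of the discrete Gauss transform at each target.

    sources, targets: sequences of equal-dimension coordinate tuples
    weights: one weight q_i per source point
    h: bandwidth of the Gaussian kernel
    """
    result = []
    for y in targets:
        total = 0.0
        for x, q in zip(sources, weights):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += q * math.exp(-sq_dist / h ** 2)
        result.append(total)
    return result
```

With all weights equal and a suitable normalisation, this sum is a kernel density estimate, which is why a faster evaluation pays off in KDE-heavy applications.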

file system


The advanced multi layered unification filesystem implements a union mount for Linux file systems.


A high-performance parallel file system from the Fraunhofer Center for High Performance Computing. BeeGFS is a pure software solution for scale-out parallel network-accessible storage, developed with a strong focus on performance and designed for very easy installation and management. If I/O intensive workloads are your problem, BeeGFS is the solution.

BeeGFS provides a common file system for shared access to multiple clients and transparently spreads user data across multiple servers. By increasing the number of servers and/or disks in the system, you can simply scale performance and capacity of the file system to the level that you need.


Ceph is a free software storage platform designed to present object, block, and file storage from a single distributed computer cluster. Ceph’s main goals are to be completely distributed without a single point of failure, scalable to the exabyte level, and freely-available. The data is replicated, making it fault tolerant. Ceph software runs on commodity hardware. The system is designed to be both self-healing and self-managing and strives to reduce both administrator and budget overhead.


A network file system based on HTTP and optimized to deliver experiment software in a fast, scalable, and reliable way. Files and file metadata are aggressively cached and downloaded on demand. Thereby the CernVM-FS decouples the life cycle management of the application software releases from the operating system.


Chirp is a user-level file system for collaboration across distributed systems such as clusters, clouds, and grids. Chirp allows ordinary users to discover, share, and access storage, whether within a single machine room or over a wide area network.

Chirp requires no special privileges. Unlike most standard filesystems or storage services, Chirp does not require root access, kernel changes, special modules, or anything like that. It can be run by ordinary users to export ordinary filesystems on any machine or port that you like.

Chirp is transparent. When used with Parrot or FUSE, Chirp servers can be transparently attached to existing ordinary applications — like tcsh, vi, and perl — without any sort of kernel changes or special privileges. Chirp is designed to give maximum compatibility with standard Unix semantics.

Chirp is easy to deploy. Chirp is designed to be deployed with a minimum of fuss. One simple command starts a Chirp server or a Chirp client. There is no complex configuration, installation, or setup to mess up. It just works. This makes Chirp ideal for on-the-fly storage management in batch computing and grid computing environments.


Filesystem in Userspace (FUSE) is an operating system mechanism for Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a "bridge" to the actual kernel interfaces.

FUSE is particularly useful for writing virtual file systems. Unlike traditional file systems that essentially save data to and retrieve data from disk, virtual filesystems do not actually store data themselves. They act as a view or translation of an existing file system or storage device.


CloudFusion lets you access a multitude of cloud storages from Linux like any file on your desktop. Work with files from Dropbox, Sugarsync, Amazon S3, Google Storage, Google Drive, and WebDAV storages like any other file on your desktop.


GlusterFS is an open source, distributed file system capable of scaling to several petabytes (actually, 72 brontobytes!) and handling thousands of clients. GlusterFS clusters together storage building blocks over Infiniband RDMA or TCP/IP interconnect, aggregating disk and memory resources and managing data in a single global namespace. GlusterFS is based on a stackable user space design and can deliver exceptional performance for diverse workloads.


The goal of the present HTTPFS project is to enable access to remote files, directories, and other containers (e.g., structured text documents, OS tables) through an HTTP pipe. HTTPFS system permits retrieval, creation and modification of these resources as if they were regular files and directories on a local filesystem. The remote host can be any UNIX or Win9x/WinNT box that is capable of running a Perl CGI script, and accessible either directly or via a web proxy or a gateway. HTTPFS runs entirely in user space. The current implementation fully supports reading as well as creating, writing, appending, and truncating of files on a remote HTTP host. HTTPFS provides an isolation level for concurrent file access stronger than the one mandated by POSIX file system semantics, closer to that of AFS. Both a programmatic interface with familiar open(), read(), write(), close(), etc. calls, and an interactive interface, via the popular Midnight Commander file browser, are provided.


An open-source, parallel file system that provides a POSIX compliant file system interface and can scale to thousands of clients, petabytes of storage and hundreds of gigabytes per second of I/O bandwidth. The key components of the Lustre file system are the Metadata Servers (MDS), the Metadata Targets (MDT), Object Storage Servers (OSS), Object Storage Targets (OST) and the Lustre clients.

The ability of a Lustre file system to scale capacity and performance for any need reduces the need to deploy many separate file systems, such as one for each compute cluster. Storage management is simplified by avoiding the need to copy data between compute clusters. In addition to aggregating storage capacity of many servers, the I/O throughput is also aggregated and scales with additional servers. Moreover, throughput and/or capacity can be easily increased by adding servers dynamically.


Parrot is a tool for attaching old programs to new storage systems. Parrot makes a remote storage system appear as a file system to a legacy application. Parrot does not require any special privileges, any recompiling, or any change whatsoever to existing programs. It can be used by normal users doing normal tasks.

Parrot is useful to users of distributed systems, because it frees them from rewriting code to work with new systems and relying on remote administrators to trust and install new software. Parrot is also useful to developers of distributed systems, because it allows rapid deployment of new code to real applications and real users that do not have the time, inclination, or permissions to build a kernel-level filesystem.

Parrot "speaks" a variety of remote I/O services, including HTTP, FTP, GridFTP, iRODS, HDFS, XRootD, GROW, and Chirp, on behalf of ordinary programs. It works by trapping a program’s system calls through the ptrace debugging interface, and replacing them with remote I/O operations as desired.

Robinhood Policy Engine

Robinhood Policy Engine is a versatile tool to manage contents of large file systems. It maintains a replica of filesystem metadata in a database that can be queried at will. It makes it possible to schedule mass actions on filesystem entries by defining attribute-based policies, provides fast, enhanced clones of find and du, and gives administrators an overall view of filesystem contents through its web UI and command line tools. It supports any POSIX filesystem and implements advanced features for Lustre filesystems (list/purge files per OST or pool, read MDT changelogs…).

Originally developed for HPC, it has been designed to perform all its tasks in parallel, so it is particularly well suited to running on large filesystems with millions of entries and petabytes of data. But of course, you can take advantage of all its features for managing smaller filesystems, like /tmp of workstations.

file transfer


Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.


Fiona is designed to be simple and dependable. It focuses on reading and writing data in standard Python IO style, and relies upon familiar Python types and protocols such as files, dictionaries, mappings, and iterators instead of classes specific to OGR. Fiona can read and write real-world data using multi-layered GIS formats and zipped virtual file systems and integrates readily with other Python GIS packages such as pyproj, Rtree, and Shapely.


Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM). Firedrake enables users to apply a wide range of discretisations to an infinite variety of PDEs and employ either conventional CPUs or GPUs to obtain the solution.

Firedrake employs the Unified Form Language (UFL) and FEniCS Form Compiler (FFC) from the FEniCS Project while the parallel execution of FEM assembly is accomplished by the PyOP2 system. The global mesh data structures, as well as


Fireflies (formerly "DSvis") is a tool that allows you to visually explore (in 2D or 3D) the dynamics of arbitrary systems of ordinary differential equations (ODEs). It does this by using the GPU (via the OpenCL library) to simulate a large number of independent particles according to the specified system of ODEs. By integrating one group of particles forwards in time and the other backwards, the stable and unstable attractors of the system are revealed and the way in which they change with the system’s parameters can be seen interactively.
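
The forwards/backwards idea can be illustrated on the CPU with a few lines of Python. This is only a sketch under simple assumptions: it uses explicit Euler steps and a one-dimensional logistic ODE, dx/dt = x(1 - x), chosen for illustration; the names `integrate` and `logistic` are hypothetical and nothing here reflects the tool's actual OpenCL kernels.

```python
def integrate(f, state, dt, steps):
    """Explicit Euler integration of d(state)/dt = f(state).
    A negative dt integrates backwards in time."""
    for _ in range(steps):
        deriv = f(state)
        state = tuple(s + dt * d for s, d in zip(state, deriv))
    return state


def logistic(state):
    """Right-hand side of dx/dt = x * (1 - x): fixed points at 0 and 1."""
    (x,) = state
    return (x * (1.0 - x),)


starts = [(0.1,), (0.5,), (0.9,)]
# Forward particles collect at the stable fixed point x = 1 ...
forward = [integrate(logistic, s, 0.01, 2000) for s in starts]
# ... while backward particles collect at the unstable one, x = 0.
backward = [integrate(logistic, s, -0.01, 2000) for s in starts]
```

Running many such particles in parallel, one group per time direction, is what lets both kinds of fixed point show up at once.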


A pure Python toolkit for creating graphical user interfaces (GUIs) that uses web technology for its rendering. You can use Flexx to create desktop applications, web applications, and (if designed well) export an app to a standalone HTML document. It also works in the Jupyter notebook.


An open source, general purpose, multi-phase computational fluid dynamics code capable of numerically solving the Navier-Stokes equation and accompanying field equations on arbitrary unstructured finite element meshes in one, two and three dimensions.

It is used in a number of different scientific areas including geophysical fluid dynamics, computational fluid dynamics, ocean modelling and mantle convection. It uses a finite element/control volume method which allows arbitrary movement of the mesh with time dependent problems, allowing mesh resolution to increase or decrease locally according to the current simulated state. It has a wide range of element choices including mixed formulations.

Fluidity is parallelised using MPI and is capable of scaling to many thousands of processors. Other innovative and novel features are a user-friendly GUI and a Python interface which can be used to calculate diagnostic fields, set prescribed fields or set user-defined boundary conditions.


FortranCL is an OpenCL interface for Fortran 90. It allows programmers to call the OpenCL parallel programming framework directly from Fortran, so developers can accelerate their Fortran code using graphical processing units (GPU) and other accelerators.

The interface is designed to be as close to the C OpenCL interface as possible, while being written in native Fortran 90 with type checking. It was originally designed as an OpenCL interface to be used by the Octopus code.

The interface is not complete but provides all the basic calls required to write a full Fortran 90 OpenCL program.


Freenet is free software which lets you anonymously share files, browse and publish "freesites" (web sites accessible only through Freenet) and chat on forums, without fear of censorship. Freenet is decentralised to make it less vulnerable to attack, and if used in "darknet" mode, where users only connect to their friends, is very difficult to detect.

Communications by Freenet nodes are encrypted and are routed through other nodes to make it extremely difficult to determine who is requesting the information and what its content is.

Users contribute to the network by giving bandwidth and a portion of their hard drive (called the "data store") for storing files. Files are automatically kept or deleted depending on how popular they are, with the least popular being discarded to make way for newer or more popular content. Files are encrypted, so generally users cannot easily discover what is in their data store, and hopefully can’t be held accountable for it. Chat forums, websites, and search functionality are all built on top of this distributed data store.
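The keep-or-discard behaviour described above can be sketched as a bounded store that evicts its least-requested entry when full. This is a toy illustration only: the class name ToyDataStore, the capacity counted in entries, and the use of a plain request counter as the popularity measure are all assumptions, not Freenet's actual data store logic.

```python
class ToyDataStore:
    """Bounded key/value store that drops the least-requested entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blobs = {}      # key -> encrypted bytes (opaque to the node operator)
        self.requests = {}   # key -> number of times the key was requested

    def put(self, key, blob):
        if key not in self.blobs and len(self.blobs) >= self.capacity:
            # Evict the entry with the fewest requests to make room.
            victim = min(self.requests, key=self.requests.get)
            del self.blobs[victim]
            del self.requests[victim]
        self.blobs[key] = blob
        self.requests.setdefault(key, 0)

    def get(self, key):
        if key in self.blobs:
            self.requests[key] += 1
            return self.blobs[key]
        return None


store = ToyDataStore(capacity=2)
store.put("a", b"...")
store.put("b", b"...")
store.get("a")          # "a" is now more popular than "b"
store.put("c", b"...")  # the store is full, so "b" is discarded
print(sorted(store.blobs))
```

A real node would also have to weigh the age of an entry against its popularity; the sketch ignores that.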


Furious.js is a scientific computing package for JavaScript that was inspired by Numpy.


Gaigen is a program which can generate implementations of geometric algebras. It generates C++ and C source code which implements a geometric algebra requested by the user. The choice to create a program which generates implementations of these algebras was made because we wanted performance similar to optimized hand-written code, while maintaining full generality; for (scientific) research and experimentation, many geometric algebras with different dimensionality, signatures and other properties may be required. Instead of coding each algebra by hand, Gaigen provides the possibility to generate the code for exactly the geometric algebra the user requires. This code may be less efficient than fully optimized hand-written code, but is likely to be much more efficient than one library which tries to support all possible algebras at once.

Gaigen supports algebras with a dimension from 0 to 8. The implementation of products used in Gaigen becomes infeasible for dimensions higher than about 7 or 8. For basis vectors, all 3 signatures are supported (-1, 0, +1). It is also possible to create reciprocal pairs of null vectors, which square to 0 with themselves, but to +1 or -1 with the other. 7 basic products are implemented (geometric product, outer product, left and right contraction, scalar product, (modified) Hestenes inner product) plus the outer morphism operator and the delta product. Several useful functions (such as factorization, meet and join) have been implemented.

Everything has been designed with memory and time efficiency in mind. It is possible to optimize Gaigen for your platform, application or processor by replacing the lowest computation layer. Gaigen can suggest optimizations for the algebras you generate with it by using the provided profiler function. Benchmarks in a ray tracing application show that Gaigen is 30 to 60 times faster than CLU (C++). In another application, Gaigen was 6000 times faster than Gable (Matlab).
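To make the role of the basis-vector signatures concrete, here is a minimal sketch of the geometric product of two basis blades under a diagonal metric. It uses the common bitmask representation of blades (bit i set means basis vector e(i+1) is present); this is an assumed, generic technique chosen for illustration, not the code Gaigen generates, and the function names are hypothetical.

```python
def reorder_sign(a, b):
    """Sign picked up when the basis vectors of blades a and b are sorted."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1


def blade_product(a, b, signature):
    """Geometric product of basis blades a and b (bitmasks).

    signature[i] is the square of basis vector i: -1, 0 or +1.
    Returns (scale, blade) with scale in {-1, 0, +1}.
    """
    scale = reorder_sign(a, b)
    common = a & b          # vectors present in both blades square to a scalar
    i = 0
    while common:
        if common & 1:
            scale *= signature[i]
        common >>= 1
        i += 1
    return scale, a ^ b


E1, E2, E3 = 0b001, 0b010, 0b100
euclidean = (1, 1, 1)
print(blade_product(E1, E2, euclidean))            # e1 e2
print(blade_product(E2, E1, euclidean))            # e2 e1, the opposite sign
print(blade_product(E1 | E2, E1 | E2, euclidean))  # the bivector e12 squared
print(blade_product(E3, E3, (1, 1, -1)))           # a vector with signature -1
```

A general multivector product is then a double loop over the blades of both operands, which is the kind of code a generator can unroll for one specific algebra.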


GALAHAD is a thread-safe library of Fortran 2003 packages for solving nonlinear optimization problems. At present, the areas covered by the library are unconstrained and bound-constrained optimization, quadratic programming, nonlinear programming, systems of nonlinear equations and inequalities, and nonlinear least squares problems.


Galois is a system that automatically executes "Galoized" serial C++ or Java code in parallel on shared-memory machines. It works by exploiting amorphous data-parallelism, which is present even in irregular codes that are organized around pointer-based data structures such as graphs and trees. The Galois system includes the Lonestar benchmark suite and the ParaMeter profiler.

Multicore processors are increasingly the norm. As a result, we need to find ways to make it easier to write parallel programs. Galois allows the programmer to write serial C++ or Java code while still getting the performance of parallel execution. All the programmer has to do is use Galois-provided data structures, which are necessary for correct concurrent execution, and annotate which loops should be run in parallel. The Galois system then speculatively extracts as much parallelism as it can. The current release includes a dozen sample benchmark applications from a broad range of domains that are written using the Galois extensions and classes.

Lonestar and LonestarGPU benchmark collections are collections of widely-used real-world applications that exhibit irregular behavior.


Galry is a high performance interactive visualization package in Python based on OpenGL. It allows interactive visualization of very large plots (tens of millions of points) in real time, by using the graphics card as much as possible.

Galry’s high-level interface is directly inspired by Matplotlib and Matlab. The low-level interface can be used to write complex interactive visualization GUIs with Qt that deal with large 2D/3D datasets.

Visualization capabilities of Galry are not restricted to plotting, and include textures, 3D meshes, graphs, shapes, etc. Custom shaders can also be written for advanced uses.


PGAS (Partitioned Global Address Space) programming models have been discussed as an alternative to MPI for some time. The PGAS approach offers the developer an abstract shared address space which simplifies the programming task and at the same time facilitates data locality, thread-based programming and asynchronous communication. The goal of the GASPI project is to develop a suitable programming tool for the wider HPC community by defining a standard with a reliable basis for future developments through the PGAS API of Fraunhofer ITWM. Furthermore, an implementation of the standard as a highly portable open source library will be available. The standard will also define interfaces for performance analysis, for which tools will be developed in the project. The evaluation of the libraries is done via the parallel re-implementation of industrial applications up to and including production status.


GPI-2 implements the GASPI specification, an API specification which originates from the ideas and concepts of GPI. GPI-2 is an API for asynchronous communication. It provides a flexible, scalable and fault tolerant interface for parallel applications.


Robot simulation is an essential tool in every roboticist’s toolbox. A well-designed simulator makes it possible to rapidly test algorithms, design robots, and perform regression testing using realistic scenarios. Gazebo offers the ability to accurately and efficiently simulate populations of robots in complex indoor and outdoor environments. At your fingertips is a robust physics engine, high-quality graphics, and convenient programmatic and graphical interfaces.


G-code (also RS-274), which has many variants, is the common name for the most widely used numerical control (NC) programming language. It is used mainly in computer-aided manufacturing for controlling automated machine tools. G-code is sometimes called G programming language.

In fundamental terms, G-code is a language in which people tell computerized machine tools how to make something. The how is defined by instructions on where to move, how fast to move, and through what path to move. The most common situation is that, within a machine tool, a cutting tool is moved according to these instructions through a toolpath, cutting away excess material to leave only the finished workpiece. The same concept also extends to noncutting tools such as forming or burnishing tools, photoplotting, additive methods such as 3D printing, and measuring instruments.
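As a small illustration of "where to move, how fast to move, and through what path", the sketch below emits G-code for cutting a square outline. It uses only widely shared RS-274 words (G21 for millimetres, G90 for absolute coordinates, G0 for rapid moves, G1 for feed moves with an F feed rate); dialects differ between controllers, and the function name, safe height and dimensions are illustrative assumptions.

```python
def square_toolpath(side, depth, feed):
    """Return G-code lines that cut a square outline with one corner at the origin."""
    lines = [
        "G21",                            # units: millimetres
        "G90",                            # absolute positioning
        "G0 Z5.000",                      # rapid move up to a safe height
        "G0 X0.000 Y0.000",               # rapid move to the start corner
        f"G1 Z{-depth:.3f} F{feed:.0f}",  # plunge into the material at the feed rate
    ]
    # Walk the four edges of the square at the cutting feed rate.
    for x, y in [(side, 0), (side, side), (0, side), (0, 0)]:
        lines.append(f"G1 X{x:.3f} Y{y:.3f} F{feed:.0f}")
    lines.append("G0 Z5.000")             # retract to the safe height
    return lines


program = square_toolpath(side=10, depth=1, feed=300)
print("\n".join(program))
```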


Skeinforge is a tool chain composed of Python scripts that converts your 3D model into G-Code instructions for RepRap.


A translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and vector abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.


Reads and writes shapefiles using GDAL in the background and is therefore quite fast. Intended as an easier-to-use alternative to the original Python GDAL bindings.

Fiona is designed to be simple and dependable. It focuses on reading and writing data in standard Python IO style and relies upon familiar Python types and protocols such as files, dictionaries, mappings, and iterators instead of classes specific to OGR. Fiona can read and write real-world data using multi-layered GIS formats and zipped virtual file systems and integrates readily with other Python GIS packages such as pyproj, Rtree, and Shapely.


GDAL Python wrapper for reading and writing geospatial data to a variety of vector formats.


Rasterio reads and writes geospatial raster datasets. It employs GDAL under the hood for file I/O and raster formatting. Its functions typically accept and return Numpy ndarrays. Rasterio is designed to make working with geospatial raster data more productive.


The GeM software is designed to automate the generation of determining equations and related operations, in order to compute symmetries and conservation laws for any ODE/PDE system, generally without limitations in DE order and number of variables.

ODE/PDE systems containing arbitrary functions and/or constants can be analyzed, and classes of functions for which additional symmetries / conservation laws occur can be isolated.

GeM output (determining equations) is usually fed into Maple "rifsimp" (a stable routine for differential reduction), which simplifies determining equations, and performs case splits when the given system contains arbitrary functions and/or constants.

GeM also contains special routines to output computed symmetries as well as fluxes/densities of computed conservation laws.


This project provides tools to play with geographical data. It also works with non-geographical data, except for map visualizations. There are embedded data sources in the project, but you can easily play with your own data in addition to the available ones. CSV files containing data about airports, train stations, countries, … are loaded, then you can:

  • perform various types of queries (find this key, or find keys with this property)

  • run fuzzy searches based on string distance (find things roughly named like this)

  • run geographical searches (find things next to this place)

  • get results on a map, or export them as CSV data or as a Python object
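The three kinds of search listed above can be sketched with the Python standard library alone. This is not the project's own API: the tiny PLACES table (with rounded, illustrative coordinates), the function names, and the use of difflib and a haversine distance are all assumptions made for the example.

```python
import difflib
import math

# Hypothetical in-memory table standing in for a loaded CSV file.
PLACES = {
    "CDG": {"name": "Paris Charles de Gaulle", "country": "FR", "lat": 49.01, "lng": 2.55},
    "ORY": {"name": "Paris Orly", "country": "FR", "lat": 48.72, "lng": 2.38},
    "LHR": {"name": "London Heathrow", "country": "GB", "lat": 51.47, "lng": -0.45},
}


def find_with(field, value):
    """Property query: keys whose record has field == value."""
    return sorted(k for k, rec in PLACES.items() if rec[field] == value)


def fuzzy_find(name, cutoff=0.6):
    """Fuzzy search: keys whose name is roughly similar to the given string."""
    by_name = {rec["name"]: key for key, rec in PLACES.items()}
    matches = difflib.get_close_matches(name, list(by_name), n=3, cutoff=cutoff)
    return [by_name[m] for m in matches]


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def find_near(lat, lng, radius_km):
    """Geographical search: keys within radius_km, nearest first."""
    hits = []
    for key, rec in PLACES.items():
        d = haversine_km(lat, lng, rec["lat"], rec["lng"])
        if d <= radius_km:
            hits.append((d, key))
    return [key for _, key in sorted(hits)]


print(find_with("country", "FR"))
print(fuzzy_find("London Heatrow"))   # misspelled on purpose
print(find_near(48.86, 2.35, 50))     # within 50 km of central Paris
```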


A JavaScript framework that combines the GIS functionality of OpenLayers with the user interface of the ExtJS library. It enables the construction of desktop-like GIS applications on the web.


High-level components for GeoExt-based applications.


Geogram is a programming library of geometric algorithms. It includes a simple yet efficient Mesh data structure (for surfacic and volumetric meshes), exact computer arithmetic (à la Shewchuk, implemented in GEO::expansion), a predicate code generator (PCK: Predicate Construction Kit), standard geometric predicates (orient/insphere), standard algorithms (Delaunay triangulation, Voronoi diagram, spatial search data structures, spatial sorting) and less standard ones (more general geometric predicates, intersection between a Voronoi diagram and a triangular or tetrahedral mesh embedded in n dimensions). The latter is used by FWD/WarpDrive, the first algorithm that computes semi-discrete Optimal Transport in 3D that scales up to 1 million Dirac masses.


JavaScript Geo visualization and Analysis Library.


GeoJSON[1] is an open standard format for encoding collections of simple geographical features along with their non-spatial attributes using JavaScript Object Notation. The features include points (therefore addresses and locations), line strings (therefore streets, highways and boundaries), polygons (countries, provinces, tracts of land), and multi-part collections of these types. GeoJSON features need not represent entities of the physical world only; mobile routing and navigation apps, for example, might describe their service coverage using GeoJSON.[2]

The GeoJSON format differs from other GIS standards in that it was written and is maintained not by a formal standards organization, but by an Internet working group of developers.
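A short sketch of what such a document looks like when built from Python dictionaries and serialized with the standard json module. The feature names and coordinates are made up for illustration; note that GeoJSON positions are written as [longitude, latitude].

```python
import json


def point_feature(lon, lat, **properties):
    """A GeoJSON Feature holding a Point and its non-spatial attributes."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


collection = {
    "type": "FeatureCollection",
    "features": [
        point_feature(-73.96, 41.0, name="Example station"),
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [[-73.96, 41.0], [-73.9, 41.05], [-73.85, 41.1]],
            },
            "properties": {"name": "Example road"},
        },
    ],
}

text = json.dumps(collection, indent=2)
print(text)
```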


A simple Python GeoJSON file reader and writer. PyGeoj treats GeoJSON as an actual file format instead of a set of formatting rules.


Python bindings and utilities for GeoJSON.


An extension of GeoJSON that encodes topology. Rather than representing geometries discretely, geometries in TopoJSON files are stitched together from shared line segments called arcs.[18] Arcs are sequences of points, while line strings and polygons are defined as sequences of arcs. Each arc is defined only once, but can be referenced several times by different shapes, thus reducing redundancy and decreasing the file size.[19] In addition, TopoJSON facilitates applications that use topology, such as topology-preserving shape simplification, automatic map coloring, and cartograms.
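The arc sharing described above can be sketched as follows: two adjacent unit squares store their common edge once, and each polygon ring is rebuilt by stitching arcs together. In TopoJSON a negative arc index ~i means "arc i, reversed". The sketch assumes a non-quantized topology, so arc coordinates are absolute rather than delta-encoded; the function name and the data are illustrative.

```python
def stitch(arc_indexes, arcs):
    """Rebuild a ring or line from arc references; ~i means arc i reversed."""
    points = []
    for index in arc_indexes:
        arc = arcs[index] if index >= 0 else arcs[~index][::-1]
        # Consecutive arcs share an endpoint, so skip the duplicated point.
        points.extend(arc if not points else arc[1:])
    return points


arcs = [
    [[1, 0], [1, 1]],                   # arc 0: the edge shared by both squares
    [[1, 1], [0, 1], [0, 0], [1, 0]],   # arc 1: the rest of the left square
    [[1, 0], [2, 0], [2, 1], [1, 1]],   # arc 2: the rest of the right square
]

left = stitch([0, 1], arcs)
right = stitch([~0, 2], arcs)           # reuses arc 0 in the opposite direction
print(left)
print(right)
```

The shared edge is stored once and referenced twice, which is where the size reduction comes from.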


A powerful, metadata-driven Spatial ETL tool dedicated to the integration of different spatial data sources for building and updating geospatial data warehouses. GeoKettle enables the Extraction of data from data sources, the Transformation of data in order to correct errors, cleanse the data, change the data structure and make it compliant with defined standards, and the Loading of transformed data into a target DataBase Management System (DBMS) in OLTP or OLAP/SOLAP mode, a GIS file or a Geospatial Web Service.


GeoMapApp is an earth science exploration and visualization application that is continually being expanded as part of the Marine Geoscience Data System (MGDS) at the Lamont-Doherty Earth Observatory of Columbia University. The application provides direct access to the Global Multi-Resolution Topography (GMRT) compilation that hosts high resolution (~100 m node spacing) bathymetry from multibeam data for ocean areas and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) and NED (National Elevation Dataset) topography datasets for the global land masses.


The GEOS–Chem model is a global three-dimensional model of tropospheric chemistry driven by assimilated meteorological observations from the Goddard Earth Observing System (GEOS) of the NASA Global Modeling and Assimilation Office.

GEOS–Chem began as a merging of Mian Chin’s GEOS–CTM code with the emissions, dry deposition, and chemistry routines from the old Harvard–GISS 9-layer model. Since then, we have added many updates and improvements to GEOS–Chem. The model now uses detailed inventories for fossil fuel, biomass burning, biofuel burning, biogenic, and aerosol emissions. GEOS–Chem includes state-of-the-art transport (TPCORE) and photolysis (FAST–J) routines, as well as the SMVGEAR II chemistry solver package. Detailed aerosol microphysical simulations using GEOS–Chem may be performed with the TOMAS aerosol microphysics code or the APM aerosol microphysics code.

GEOS–Chem has been parallelized using OpenMP compiler directives, and it scales well when running across multiple CPUs on shared-memory machines. We are currently building a Grid-Independent version of GEOS-Chem in order to take advantage of distributed memory architectures and MPI parallelization.

Several software tools facilitate the visualization of GEOS-Chem model outputs. The IDL-based GAMAP package—which is developed and maintained by the GEOS–Chem Support Team—allows for easy generation of a wide variety of plots and animations. Furthermore, several members of the GEOS-Chem user community are now developing open-source software visualization tools for other computer languages, including Matlab, NCL, R, and Python.

geoscience data servers


The Dapper Data Viewer (aka DChart) allows you to visualize and download in-situ oceanographic or atmospheric data from a file or an OPeNDAP server. Features include an interactive map that is draggable, an in-situ station layer that allows you to select data stations, and a plot window that allows you to plot data from one or more stations. Three plot types are supported (profile, property-property, and time series) and users can interact directly with the plot to pan or zoom in and out.

Dapper is an OPeNDAP/Java-based web server developed by the EPIC group at PMEL that provides networked access to in-situ and gridded data. The Dapper servlet contains a set of configurable services that convert in-situ or gridded data to the OPeNDAP protocol.


ERDDAP is a data server that gives you a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. This particular ERDDAP installation has oceanographic data.
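Subsets are requested through the URL itself. The sketch below assembles a tabledap-style request of the form server/tabledap/datasetID.fileType?variables&constraints. The server address, dataset ID and variable names are hypothetical placeholders, and the helper function is not part of ERDDAP; consult the documentation of the installation you use for the datasets and variables it actually serves.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, file_type, variables, constraints=()):
    """Build a request URL: <server>/tabledap/<dataset>.<type>?<vars>&<constraints>."""
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode characters such as ">" and ":" but keep "=" readable.
        query += "&" + quote(constraint, safe="=")
    return f"{server}/tabledap/{dataset_id}.{file_type}?{query}"


url = tabledap_url(
    "https://example.org/erddap",   # hypothetical server
    "exampleBuoyData",              # hypothetical dataset ID
    "csv",                          # ask for the subset as a CSV file
    ["time", "latitude", "longitude", "sea_water_temperature"],
    ["time>=2020-01-01T00:00:00Z", "time<=2020-01-31T00:00:00Z"],
)
print(url)
```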


An implementation of a broker catalog service that allows clients to discover and evaluate geoinformation resources over a federation of data sources, and publishes different catalog interfaces, allowing different clients to use the service. A data provider can deploy his/her own GI-cat instance, grouping together disparate data sources, to accommodate his/her users' needs.

GI-cat features caching and mediation capabilities and can act as a broker towards disparate catalog and access services: by implementing metadata harmonization and protocol adaptation, it is able to transform query results to a uniform and consistent interface. GI-cat is based on a service-oriented framework of modular components and can be customized and tailored to support different deployment scenarios.

GI-cat can access a multiplicity of catalog services, as well as inventory and access services, to discover, and possibly access, heterogeneous ESS resources. Specific components, called Accessors, implement mediation services for interfacing with heterogeneous service providers which expose multiple standard specifications. These mediating components map the heterogeneous providers' metadata models into a uniform data model which implements ISO 19115, based on official ISO 19139 schemas and its extensions. Accessors also implement the query protocol mapping; they translate the query requests expressed according to the interface protocols exposed by GI-cat into the multiple query dialects spoken by the resource service providers. Currently, a number of well-accepted catalog and inventory services are supported, including several OGC Web Services (e.g. WCS, WMS), THREDDS Data Server, SeaDataNet Common Data Index, and GBIF.


Hyrax is a new data server which combines the efforts at UCAR/HAO to build a high performance DAP-compliant data server for the Earth System Grid II project with existing software developed by OPeNDAP.

Hyrax uses the Java servlet mechanism to hand off requests from a general web daemon to DAP format-specific software. This results in higher performance for small requests. The servlet front end, which we call the OPeNDAP Lightweight Front end Server (OLFS), looks at each request and formulates a query to a second server (which may or may not be on the same machine as the OLFS) called the Back End Server (BES).

The BES is the high-performance server software from HAO. It handles reading data from the data stores and returning DAP-compliant responses to the OLFS. In turn, the OLFS may pass these responses back to the requestor with little or no modification, or it may use them to build more complex responses. The nature of the Inter Process Communication (IPC) between the OLFS and BES is such that they should both be on the same machine or be able to communicate over a very high bandwidth channel.


The Live Access Server (LAS) is a highly configurable web server designed to provide flexible access to geo-referenced scientific data. It can present distributed data sets as a unified virtual data base through the use of DODS networking. Ferret is the default visualization application used by LAS.


The THREDDS Data Server (TDS) is a web server that provides metadata and data access for scientific datasets, using a variety of remote data access protocols. The THREDDS project consists of:

  • netCDF-Java/CDM library;

  • the NetCDF Markup Language (NcML);

  • the THREDDS Data Server (TDS); and

  • the THREDDS Catalog specification.

See also NetCDF Java.


NcSOS adds an OGC SOS service to datasets in your existing THREDDS server. It complies with the IOOS SWE Milestone 1.0 templates and requires your datasets be in any of the CF 1.6 Discrete Sampling Geometries.

NcSOS acts like other THREDDS services (such as OPeNDAP and WMS) in that there are individual service endpoints for each dataset. It is best to aggregate your files and enable the NcSOS service on top of the aggregation. For example, the NcML aggregate of hourly files from an individual station would be a good candidate to serve with NcSOS; serving the individual hourly files with NcSOS would not be as beneficial.


A collection of Python utilities for downloading data from Unidata data technologies, i.e. from a THREDDS server.


Contents of TDS configuration directories for several variants including a TDS serving all Unidata IDD data.


Styles for a THREDDS server.


Python Geographic Visualizer (GeoVis) is a standalone geographic visualization module for the Python programming language intended for easy everyday use by novices and power-programmers alike. It has one-liners for quickly visualizing a shapefile, building and styling basic maps with multiple shapefile layers, and/or saving to image files. It uses the built-in Tkinter or other third-party rendering modules to do its main work.


A control system framework for personal fabrication.


Gestalt is a framework for building controllers for automated tools. It enables you to import your machines as Python modules, and makes it easy to connect machines to browser-based user interfaces.


Up-to-date meta-databases are vital for the analysis of biological data. The current exponential increase in biological data is also exponentially increasing meta-database sizes. Large-scale meta-database management is therefore an important challenge for platforms providing services for biological data analysis. In particular, there is a need either to run an analysis with a particular version of a meta-database, or to rerun an analysis with an updated meta-database. We present our GeStore approach for biological meta-database management. It provides efficient storage and runtime generation of specific meta-database versions, and efficient incremental updates for biological data analysis tools. The approach is transparent to the tools, and we provide a framework that makes it easy to integrate GeStore with biological data analysis frameworks. We present the GeStore system, as well as an evaluation of the performance characteristics of the system, and an evaluation of the benefits for a biological data analysis workflow.


The General, Hybrid, and Optimized Sparse Toolkit (GHOST) is a collection of building blocks that targets algorithms dealing with sparse matrix representations on current and future large-scale systems. It implements the "MPI+X" paradigm, has a pure C interface, and provides hybrid-parallel numerical kernels, intelligent resource management, and truly heterogeneous parallelism for multicore CPUs, Nvidia GPUs, and the Intel Xeon Phi.


Applications for the GHOST package.



The System for Automated Geoscientific Analyses is a GIS package with immense capabilities for geodata processing and analysis. SAGA is programmed in the object oriented C++ language and supports the implementation of new functions with a very effective Application Programming Interface (API). Functions are organised as modules in framework independent Module Libraries and can be accessed via SAGA’s Graphical User Interface (GUI) or various scripting environments (shell scripts, Python, R, etc.).


Da bomb.

Building maintainable step-by-step tutorials with Git -


A git repository browser that can generate static HTML instead of having to run dynamically.

It is smaller, with fewer features and a different set of tradeoffs than other similar software, so if you’re looking for a robust and featureful git browser, please look at gitweb or cgit instead.

However, if you want to generate static HTML at the expense of features, then it can be useful.


A sleek and powerful git GUI that is written in Python.


A version controlled file system.


The hub subcommand for git allows you to perform many of the operations made available by GitHub’s v3 REST API from the git command line.

You can fork, create, delete and modify repositories. You can get information about users, repositories and issues. You can star, watch and follow things, and find out who else is doing the same. The API is quite extensive. With this command you can do many of your day to day GitHub actions without needing a web browser.

You can also chain commands together using the output of one as the input of another. For example you could use this technique to clone all the repos of a GitHub user or organization, with one command.


GitLab is an advanced Git-repository manager. It introduces a powerful code review and issue-tracking system, complete with GitLab CI: a powerful continuous integration tool.


Gitless is an experimental version control system built on top of Git. Many people complain that Git is hard to use. We think the problem lies deeper than the user interface, in the concepts underlying Git. Gitless is an experiment to see what happens if you put a simple veneer on an app that changes the underlying concepts. Because Gitless is implemented on top of Git (could be considered what Git pros call a porcelain of Git), you can always fall back on Git. And of course your coworkers you share a repo with need never know that you’re not a Git aficionado.


This git command "clones" an external git repo into a subdirectory of your repo. Later on, upstream changes can be pulled in, and local changes can be pushed back. Simple.

Git Town

An open-source Git plugin. It adds Git commands that make collaborative software development more efficient and safe.


A self-hosted Git service written in Go.


A source code management system that supports two leading version control systems, Mercurial and Git, and has a web interface that is easy to use for users and admins. You can install Kallithea on your own server and host repositories for the version control system of your choice.


Create, edit and (optionally) display a journal article, entirely in GitHub. In contrast to the more traditional process of submit > peer review > publish at PeerJ, or even the less formal preprints at PeerJ Preprints or arXiv, Paper Now is an experiment to see where the future may go with scholarly communication. Initially, it may be that co-authors collaborate either privately or publicly on GitHub and then proceed to submitting to PeerJ or other journals for formal peer-review or preprinting. Or perhaps this is where the traditional medium of publication begins to diverge. There is no end goal other than to see what the academic community wants, which is why this is completely open to fork, extend, and build upon.


Givaro is a C++ library for arithmetic and algebraic computations. Its main features are implementations of the basic arithmetic of many mathematical entities: prime fields, extension fields, finite fields, finite rings, polynomials, algebraic numbers, and arbitrary precision integers and rationals (C++ wrappers over gmp). It also provides data structures and templated classes for the manipulation of basic algebraic objects, such as vectors, matrices (dense, sparse, structured), and univariate polynomials (and therefore recursive multivariate). It contains different program modules and is fully compatible with the LinBox linear algebra library and the KAAPI kernel for Adaptative, Asynchronous Parallel and Interactive programming.


Gizeh is a Python library for vector graphics. Gizeh is written on top of the module cairocffi, which is a Python binding of the popular C library Cairo. Cairo is powerful, but difficult to learn and use. Gizeh implements a few classes on top of Cairo that make it more intuitive.


This site summarises the 1D lake water balance and vertical stratification model: “The General Lake Model” (GLM).


Lasso and elastic-net regularized generalized linear models. This is a Matlab port of the extremely efficient procedures for fitting the entire lasso or elastic-net path for linear regression, logistic and multinomial regression, Poisson regression, and the Cox model.


This page contains the codes for learning the Granger causality in different settings. The codes are written in Matlab and depend on the GLMnet package for performing Lasso. Lasso-Granger is an efficient algorithm for learning the temporal dependency among multiple time series based on variable selection using Lasso. Copula-Granger extends the power of Lasso-Granger to non-linear datasets. It uses the copula technique to separate the marginal properties of the joint distribution from its dependency structure.


We describe glsim, a C++ library designed to provide routines to perform basic housekeeping tasks common to a very wide range of simulation programs, such as reading simulation parameters or reading and writing self-describing binary files with simulation data. The design also provides a framework to add features to the library while preserving its structure and interfaces.


Glumpy is a Python library for scientific visualization that is fast, scalable and beautiful. Glumpy offers an intuitive interface between numpy and modern OpenGL.

GNU Radio

GNU Radio is a free software development toolkit that provides the signal processing runtime and processing blocks to implement software radios using readily-available, low-cost external RF hardware and commodity processors. It is widely used in hobbyist, academic and commercial environments to support wireless communications research as well as to implement real-world radio systems.

GNU Radio applications are primarily written using the Python programming language, while the supplied, performance-critical signal processing path is implemented in C++ using processor floating point extensions where available. Thus, the developer is able to implement real-time, high-throughput radio systems in a simple-to-use, rapid-application-development environment.


Gqrx is a software defined radio receiver powered by the GNU Radio SDR framework and the Qt graphical toolkit. Gqrx supports much of the available SDR hardware, including Funcube Dongles, rtl-sdr, HackRF and USRP devices.


Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

Boom Filters

Boom Filters are probabilistic data structures for processing continuous, unbounded streams. This includes Stable Bloom Filters, Scalable Bloom Filters, Counting Bloom Filters, Inverse Bloom Filters, Cuckoo Filters, several variants of traditional Bloom filters, HyperLogLog, Count-Min Sketch, and MinHash.
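For readers unfamiliar with the family, the sketch below shows the classic Bloom filter that these variants build on: a bit array plus k hash positions per item, giving membership tests with no false negatives and a tunable false-positive rate. It is a Python illustration under assumed parameters (1024 bits, 3 hashes, salted SHA-256), not the API of the Boom Filters library itself.

```python
import hashlib


class BloomFilter:
    """Classic Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
for event in ["alpha", "beta", "alpha"]:   # items arriving from a stream
    seen.add(event)

print("alpha" in seen)   # an added item is always reported as present
```

A plain Bloom filter only ever fills up, which is why unbounded streams call for variants such as the stable or scalable filters named above.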


The Generalized Ocean Layer Dynamics (GOLD) code is a hybrid coordinate finite volume ocean model funded by NOAA and developed by the ocean group at NOAA-GFDL and Princeton University. The GOLD code was used for the ocean model component of the Earth System Model ESM2G.

Google Earth Engine

Google Earth Engine brings together the world’s satellite imagery — trillions of scientific measurements dating back almost 40 years — and makes it available online with tools for scientists, independent researchers, and nations to mine this massive warehouse of data to detect changes, map trends and quantify differences on the Earth’s surface. Applications include: detecting deforestation, classifying land cover, estimating forest biomass and carbon, and mapping the world’s roadless areas.


GPTIPS is a free symbolic data mining platform and interactive modelling environment for MATLAB.


Gradle is an open source build automation system. Gradle can automate the building, testing, publishing, deployment and more of software packages or other types of projects such as generated static websites, generated documentation or indeed anything else.

Gradle is a project automation tool that builds upon the concepts of Apache Ant and Apache Maven and introduces a Groovy-based domain-specific language (DSL) instead of the more traditional XML form of declaring the project configuration.

Gradle was designed for multi-project builds which can grow to be quite large, and supports incremental builds by intelligently determining which parts of the build tree are up-to-date, so that any task dependent upon those parts will not need to be re-executed.
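
The up-to-date check described above can be sketched in a few lines of Python. This is only a toy model of the idea (fingerprint a task's inputs and skip the task when the fingerprint is unchanged), not Gradle's implementation; `fingerprint`, `IncrementalRunner` and the status strings are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


def fingerprint(inputs: Dict[str, bytes]) -> str:
    """Hash input names and contents in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(inputs):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(inputs[name])
        digest.update(b"\0")
    return digest.hexdigest()


class IncrementalRunner:
    """Hypothetical runner that skips tasks whose inputs did not change."""

    def __init__(self) -> None:
        self._last_fingerprint: Dict[str, str] = {}

    def run(self, task: str, inputs: Dict[str, bytes],
            action: Callable[[], None]) -> str:
        current = fingerprint(inputs)
        if self._last_fingerprint.get(task) == current:
            return "UP-TO-DATE"  # nothing changed, so the action is skipped
        action()
        self._last_fingerprint[task] = current
        return "EXECUTED"


runner = IncrementalRunner()
log = []
sources = {"Main.java": b"class Main {}"}
first = runner.run("compile", sources, lambda: log.append("compile"))
second = runner.run("compile", sources, lambda: log.append("compile"))
sources["Main.java"] = b"class Main { int x; }"
third = runner.run("compile", sources, lambda: log.append("compile"))
```

Only the first and third calls run the action; the second is skipped because the inputs are byte-for-byte identical.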

The initial plugins are primarily focused around Java, Groovy and Scala development and deployment, but more languages and project workflows are on the roadmap.


Graphite is an open-source, distributed parallel simulator for multicore architectures. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development.

Graphite (3D)

Graphite is a research platform for computer graphics, 3D modeling and numerical geometry.


Grappa makes an entire cluster look like a single, powerful, shared-memory machine. By leveraging the massive amount of concurrency in large-scale data-intensive applications, Grappa can provide this useful abstraction with high performance. Unlike classic distributed shared memory (DSM) systems, Grappa does not require spatial locality or data reuse to perform well.

Data-intensive, or "Big Data", workloads are an important class of large-scale computations. However, the commodity clusters they are run on are not well suited to these problems, requiring careful partitioning of data and computation. A diverse ecosystem of frameworks has arisen to tackle these problems, such as MapReduce, Spark, Dryad, and GraphLab, which ease development of large-scale applications by specializing to particular algorithmic structure and behavior.

Grappa provides abstraction at a level high enough to subsume many performance optimizations common to these data-intensive platforms. However, its relatively low-level interface provides a convenient abstraction for building data-intensive frameworks on top of. Prototype implementations of (simplified) MapReduce, GraphLab, and a relational query engine have been built on Grappa that out-perform the original systems.


The Monash simple climate model is based on the Globally Resolved Energy Balance (GREB) model, a climate model published by Dommenget and Floeter [2011] in the peer-reviewed journal Climate Dynamics. The model simulates most of the main physical processes in the climate system in a highly simplified way and therefore allows very fast and simple climate model simulations. It can compute one year of global climate simulation in about 1 second on a standard PC. Despite its simplicity, the model simulates the climate response to external forcings, such as a doubling of CO2 concentrations, very realistically (similar to state-of-the-art climate models).
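
To give a feel for what an energy balance model computes, here is a zero-dimensional sketch in Python. It is far simpler than GREB, which resolves the globe and many more processes; the effective emissivity, the mixed-layer heat capacity and the function names are assumptions made for this illustration only.

```python
SIGMA = 5.670374419e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
SECONDS_PER_DAY = 86400.0


def absorbed_solar(solar_constant: float = 1361.0, albedo: float = 0.30) -> float:
    """Globally averaged absorbed sunlight in W m^-2."""
    return solar_constant * (1.0 - albedo) / 4.0


def equilibrium_temperature(emissivity: float = 0.612) -> float:
    """Temperature at which outgoing radiation balances absorbed sunlight.

    The effective emissivity is an assumed tuning value standing in for
    the greenhouse effect.
    """
    return (absorbed_solar() / (emissivity * SIGMA)) ** 0.25


def simulate(t_start: float, years: int,
             heat_capacity: float = 4.0e8,   # assumed, J m^-2 K^-1
             emissivity: float = 0.612) -> float:
    """Forward-Euler integration of C dT/dt = absorbed - emitted, daily steps."""
    temperature = t_start
    for _ in range(years * 365):
        net_flux = absorbed_solar() - emissivity * SIGMA * temperature ** 4
        temperature += SECONDS_PER_DAY * net_flux / heat_capacity
    return temperature


t_eq = equilibrium_temperature()
t_end = simulate(t_start=273.15, years=100)
```

With these assumed parameters the equilibrium sits near 288 K, and a run started at the freezing point relaxes towards it over a few decades.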


A nightmarish data format that’s driven many a geoscientist to drink.

Other packages that can deal with this format are CDO, IDV, NetCDF Java and PyNIO.


An application program interface accessible from C, FORTRAN and Python programs developed for encoding and decoding WMO FM-92 GRIB edition 1 and edition 2 messages. A useful set of command line tools is also provided to give quick access to GRIB messages.


A program to manipulate, inventory and decode GRIB files.


A greatly extended version of wgrib that can handle both GRIB1 and GRIB2 files.


Visualization of weather data from files in GRIB 1 format.


In gRPC a client application can directly call methods on a server application on a different machine as if it were a local object, making it easier for you to create distributed applications and services. As in many RPC systems, gRPC is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. On the server side, the server implements this interface and runs a gRPC server to handle client calls. On the client side, the client has a stub that provides exactly the same methods as the server.

gRPC clients and servers can run and talk to each other in a variety of environments - from servers inside Google to your own desktop - and can be written in any of gRPC’s supported languages. So, for example, you can easily create a gRPC server in Java with clients in Go, Python, or Ruby. In addition, the latest Google APIs will have gRPC versions of their interfaces, letting you easily build Google functionality into your applications.
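
The service and stub split described above can be illustrated without any RPC library. The Python sketch below is a toy: it does not use gRPC, it encodes calls as JSON rather than Protocol Buffers, and the "network" is a direct function call. `GreeterService`, `Server` and `GreeterStub` are hypothetical names.

```python
import json
from typing import Callable


class GreeterService:
    """Server-side implementation of a hypothetical Greeter service."""

    def say_hello(self, name: str) -> str:
        return f"Hello, {name}!"


class Server:
    """Decodes a request and dispatches it to a method on the service."""

    def __init__(self, service: object) -> None:
        self._service = service

    def handle(self, request: bytes) -> bytes:
        call = json.loads(request)
        method = getattr(self._service, call["method"])
        result = method(**call["params"])
        return json.dumps({"result": result}).encode()


class GreeterStub:
    """Client-side stub exposing the same method as the service."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport

    def say_hello(self, name: str) -> str:
        request = json.dumps(
            {"method": "say_hello", "params": {"name": name}}
        ).encode()
        return json.loads(self._transport(request))["result"]


server = Server(GreeterService())
stub = GreeterStub(transport=server.handle)  # in-process stand-in for the network
reply = stub.say_hello("world")
```

The caller only ever sees `stub.say_hello(...)`; serialization and transport stay hidden behind the stub, which is the property the description emphasizes.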


The community GSI system is a variational data assimilation system, designed to be flexible, state-of-art, and run efficiently on various parallel computing platforms. The GSI system is in the public domain and is freely available for community use.

The Developmental Testbed Center (DTC) currently maintains and supports a community version of the GSI system (now at Version 3.3). The testing and support of this GSI system at the DTC currently focus on regional numerical weather prediction (NWP) applications coupled with the Weather Research and Forecasting (WRF) Model, but the GSI can be applied to the Global Forecast System (GFS) as well as other modelling systems.


The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.


CythonGSL provides a Cython interface for the GNU Scientific Library (GSL). Cython is the ideal tool to speed up numerical computations by converting typed Python code to C and generating Python wrappers so that these compiled functions can be called from Python. Scientific programming often requires use of various numerical routines (e.g. numerical integration, optimization). While SciPy provides many of those tools, there is an overhead associated with using these functions within your Cython code. CythonGSL allows you to shave off that last layer by providing Cython declarations for the GSL which allow you to use this high-quality library from within Cython without any Python overhead.


A portable, object-based Fortran interface to the GNU scientific library, a collection of numerical routines for scientific computing.


The GNU Scientific Library for Lisp (GSLL) allows you to use the GNU Scientific Library (GSL) from Common Lisp. This library provides a full range of common mathematical operations useful to scientific and engineering applications. The design of the GSLL interface is such that access to most of the GSL library is possible in a Lisp-natural way; the intent is that the user not be hampered by the restrictions of the C language in which GSL has been written. GSLL thus provides interactive use of GSL for getting quick answers, even for someone not intending to program in Lisp.


O2scl is a C++ library for object-oriented numerical programming. It includes interpolation, differentiation, integration, roots of polynomials, equation solving, minimization, constrained minimization, Monte Carlo integration, simulated annealing, least-squares fitting, solution of ordinary differential equations, two-dimensional interpolation, Chebyshev approximation, unit conversions, and file I/O with HDF5.


A Fortran 90 input/output library, "gtool5", is developed for use with numerical simulation models in the fields of Earth and planetary sciences. The use of this library will simplify implementation of input/output operations into program code in a consolidated form independent of the size and complexity of the software and data. The library also enables simple specification of the metadata needed for post-processing and visualization of the data. These aspects improve the readability of simulation code, which facilitates the simultaneous performance of multiple numerical experiments with different software and efficiency in examining and comparing the numerical results.


GUESS is an exploratory data analysis and visualization tool for graphs and networks. The system contains a domain-specific embedded language called Gython (an extension of Python, or more specifically Jython) which supports the operators and syntactic sugar necessary for working on graph structures in an intuitive manner. An interactive interpreter binds the text that you type in the interpreter to the objects being visualized for more useful integration. GUESS also offers a visualization front end that supports the export of static images and dynamic movies.


Gun is a persisted distributed cache, part of a NoDB movement. It requires zero maintenance and runs on your own infrastructure. Think of it as "Dropbox for Databases" or a "Self-hosted Firebase". This is an early preview, so check out the GitHub repository and read on.

Everything gets cached, so your users experience lightning fast response times. Since gun can be embedded anywhere JavaScript can run, that cache can optionally be right inside your user’s browser using localStorage fallbacks. Updates are then pushed up to the servers when the network is available.


gvSIG Community Edition (CE) is a community driven GIS project fork of gvSIG that will be bundled with SEXTANTE and GRASS GIS. This project is not supported by the gvSIG Association. gvSIG CE is not an official project of gvSIG. gvSIG CE is a fully functional Open Source Desktop GIS that provides powerful visualization (including thematic maps, advanced symbology and labelling), cartography, raster, vector and geoprocessing in a single, integrated software suite.


An optimized HTTP server with support for HTTP/1.x and HTTP/2. H2O is a very fast HTTP server written in C. It can also be used as a library.


The Habanero-C (HC) language under development in the Habanero project at Rice University builds on past work on Habanero-Java, which in turn was derived from X10 v1.5. HC serves as a research testbed for new compiler and runtime software technologies for extreme scale systems for homogeneous and heterogeneous processors.

Habanero-C is designed to be mapped onto hardware platforms with lightweight system software stacks, such as the Customizable Heterogeneous Platform (CHP) being developed in the NSF Expeditions Center for Domain-Specific Computing (CDSC) which includes CPUs, GPUs, and FPGAs. The C foundation also makes it easier to integrate HC with communication middleware for cluster systems, such as MPI and GASNet.

The Habanero-C compiler is written in C++ and is built on top of the ROSE compiler infrastructure, which was also used in the DARPA-funded PACE project at Rice University. The bulk of the Habanero-C runtime has been written from scratch in portable ANSI C. However, a few library routines for low-level synchronization and atomic operations are written in assembly language for the target platform. To date, the Habanero-C runtime has been ported and tested on Intel X86, Cyclops 64, Power7, Sun Niagara 2 and Intel SCC multicore platforms.


The Apache Hadoop project provides an open-source framework for reliable, scalable, distributed computing. As such, it can be deployed and used on the Grid 5000 platform. However, its configuration and management can sometimes be difficult, especially given the dynamic nature of clusters within Grid 5000 reservations. Execo, in turn, offers a Python API for managing process execution. It is well suited for quick and easy creation of reproducible experiments on distributed hosts.

The project presented here is called hadoop_g5k. It provides a layer built on top of Execo that makes it possible to manage Hadoop clusters and prepare reproducible experiments in Hadoop. It offers a set of scripts to be used in command-line interfaces and a Python interface.


A Python library that allows you to finely manage unix processes on thousands of remote hosts.


HClib is a library implementation of the Habanero-C language. The reference HClib implementation is built on top of the Open Community Runtime (OCR).


The Haxe programming language is a high-level, strictly typed programming language that the Haxe compiler uses to produce cross-platform native code. It is easy to learn if you are already familiar with Java, C++, PHP, AS3 or similar object-oriented languages. It has been designed specifically to adapt to each platform's native behaviors and to allow efficient cross-platform development.

The Haxe Compiler is responsible for translating the Haxe programming language to the target platform native source code or binary. Each platform is natively supported, without any overhead coming from running inside a virtual machine. The Haxe Compiler is very efficient and can compile thousands of classes in seconds.

The Haxe standard library provides a common set of highly tested APIs that gives you complete cross-platform behavior. This includes data structures, maths and date, serialization, reflection, bytes, crypto, file system, database access, etc. The Haxe standard library also includes platform-specific API that gives you access to important parts of the platform capabilities, and can be easily extended.

The compiler targets include Flash, Neko, JavaScript, ActionScript 3, PHP, C++, Java, C# and Python.

Haxe is written in OCaml.

Haxe UI

Create rich, cross-platform user interfaces quickly with a single framework.


Massive provides a number of open source libraries and tools that are intended to increase the quality, efficiency and consistency of cross-platform development with Haxe.


A haxelib that downloads the node-webkit binary for your platform and makes it accessible via haxelib run node-webkit path/to/index.html. Node Webkit lets you run a Webkit shell on the desktop, meaning you can use Haxe and HTML5 / JS technologies to build your app. It provides full access to the NodeJS APIs so your app can integrate with the system.


Use WxWidgets to create desktop apps with a truly native look and feel on all major platforms. Works with the C++ and Neko targets, and integrates with NME.




The ExaScale IO (ESIO) library provides simple, high throughput input and output of structured data sets using parallel HDF5. ESIO is designed to support reading and writing turbulence simulation restart files but it may be useful in other contexts. The library is written in C99 and may be used by C89 or C++ applications. A Fortran API built atop the F2003 standard ISO_C_BINDING is also available.


Our proposed work consists of three thrust areas that address these contemporary challenges. First, we will provide high performance I/O middleware that makes effective use of computational platforms, researching a number of optimization strategies and deploying them through the HDF5 software. Second, we will improve the productivity of application developers by hiding the complexity of parallel I/O via new auto-tuning and transparent data re-organization techniques, and by extending our existing work in easy-to-use, high-level APIs that expose scientific data models. Third, we will facilitate scientific analysis for users by extending query-based techniques, developing novel in situ analysis capabilities, and making sure that visualization tools use best practices when reading HDF5 data.


An interface to the HDF5 library for Julia.


The H5FDdsm project provides a Virtual File Driver for HDF5, which can be used to link two applications via a virtual file system. One application (server/host) owns a memory buffer, which may be distributed over N processes (DSM buffer) - the second application (client) writes to HDF5 in parallel using M processes and the data is diverted to the DSM host, where it can be read in parallel as if from disk. The file system is bypassed completely and the data is transmitted using one of several network protocols (MPI or TCP over sockets currently supported). Note that the interface can also be used within the same application as a parallel data staging layer, in this case, no connection is required and information is exchanged between processes using MPI.


H5Part is a very simple data storage schema and provides an API that simplifies the reading/writing of the data to the HDF5 file format. An important foundation for a stable visualization and data analysis environment is a stable and portable file storage format and its associated APIs. The presence of a "common file storage format," including associated APIs, will help foster a fundamental level of interoperability across the project’s software infrastructure. It will also help ensure that key data analysis capabilities are present during the earliest phases of the software development effort.


ICARUS is a ParaView plug-in interfaced around the H5FDdsm driver for steering and visualizing in-situ HDF5 output of simulation codes.


A Pythonic interface to the HDF5 binary data format.


A web service that implements a REST-based web service for HDF5 data stores.


An important goal of scientific data analysis is to understand the behavior of a system or process based on a sample of the system. In many instances it is possible to observe both input parameters and system outputs, and characterize the system as a high-dimensional function. Such data sets arise, for instance, in large numerical simulations, as energy landscapes in optimization problems, or in the analysis of image data relating to biological or medical parameters. This paper proposes an approach to analyzing and visualizing such data sets. The proposed method combines topological and geometric techniques to provide interactive visualizations of discretely sampled high-dimensional scalar fields. The method relies on a segmentation of the parameter space using an approximate Morse-Smale complex on the cloud of point samples. For each crystal of the Morse-Smale complex, a regression of the system parameters with respect to the output yields a curve in the parameter space. The result is a simplified geometric representation of the Morse-Smale complex in the high dimensional input domain. Finally, the geometric representation is embedded in 2D, using dimension reduction, to provide a visualization platform. The geometric properties of the regression curves enable the visualization of additional information about each crystal such as local and global shape, width, length, and sampling densities. The method is illustrated on several synthetic examples of two dimensional functions. Two use cases, using data sets from the UCI machine learning repository, demonstrate the utility of the proposed approach on real data. Finally, in collaboration with domain experts, the proposed method is applied to two scientific challenges: the analysis of parameters of climate simulations and their relationship to predicted global energy flux, and the concentrations of chemical species in a combustion simulation and their integration with temperature.


Adaptive, or self-aware, computing has been proposed as one method to help application programmers confront the growing complexity of multicore software development.

However, existing approaches to adaptive systems are largely ad hoc and often do not manage to incorporate the true performance goals of the applications they are designed to support.

This project proposed an enabling technology for adaptive computing systems: Application Heartbeats. The Application Heartbeats framework provides a simple, standard programming interface that applications can use to indicate their performance and system software (and hardware) can use to query an application’s performance.
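
A heartbeat interface of this kind might look roughly like the Python sketch below: the application calls `beat()` at meaningful points, and an observer queries the rate over a sliding window. The class and method names are hypothetical and are not the Application Heartbeats API; the injectable clock exists only to make the example deterministic.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class Heartbeat:
    """Hypothetical heartbeat monitor: applications beat, observers query."""

    def __init__(self, window: int = 20,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._stamps: Deque[float] = deque(maxlen=window)
        self.target: Tuple[float, float] = (0.0, float("inf"))

    def set_target_rate(self, minimum: float, maximum: float) -> None:
        """Record the performance goal in beats per second."""
        self.target = (minimum, maximum)

    def beat(self) -> None:
        """Called by the application each time a unit of work completes."""
        self._stamps.append(self._clock())

    def rate(self) -> float:
        """Average beats per second over the most recent window."""
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0

    def meets_target(self) -> bool:
        low, high = self.target
        return low <= self.rate() <= high


# Fake clock that advances 0.5 s per beat, standing in for real work.
ticks = iter(i * 0.5 for i in range(10))
hb = Heartbeat(window=5, clock=lambda: next(ticks))
hb.set_target_rate(1.0, 4.0)
for _ in range(10):
    hb.beat()
```

An external scheduler or runtime could poll `rate()` and `meets_target()` to decide whether the application needs more or fewer resources.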


An open source, object-oriented, simple global climate carbon-cycle model. It runs essentially instantaneously while still representing the most critical global scale earth system processes, and is one of a class of models heavily used for complex climate model emulation and uncertainty analyses.


Hermes2D (Higher-order modular finite element system) is a C++/Python library of algorithms for rapid development of adaptive hp-FEM solvers. hp-FEM is a modern version of the finite element method (FEM) that is capable of extremely fast, exponential convergence.

The Hermes library can be used for a large variety of PDE problems ranging from linear elliptic equations to time-dependent nonlinear multi-physics PDE systems arising in elasticity, structural mechanics, fluid mechanics, acoustics, electromagnetics, and other fields of computational engineering and science.

The Documentation for the Hermes libraries is an extensive set of instructions, information and tutorials related to the use of Hermes and the Finite Element Method. Hermes includes instructions for the installation of collaborating Third Party Libraries (TPLs) as well as an introduction to the mathematics behind the hp-FEM method and detailed instructions on the use and modification of the code.


HHVM is an open-source virtual machine designed for executing programs written in Hack and PHP. HHVM uses a just-in-time (JIT) compilation approach to achieve superior performance while maintaining the development flexibility that PHP provides.


A massively parallel, high-performance X-ray scattering data analysis code. HipGISAXS is massively parallel software developed in C++, augmented with MPI, Nvidia CUDA, OpenMP, and parallel-HDF5 libraries, for large-scale clusters of multi/many-core processors and graphics processors. HipGISAXS currently supports *NIX-based systems and can harness computational power from general-purpose CPUs, including state-of-the-art multicores, as well as Nvidia GPUs and Intel MIC coprocessors. It can handle large input data, including any custom complex morphology, and perform GISAXS simulations at high resolutions.


HLib is a program library for hierarchical matrices and H2-matrices. H-matrices are a powerful tool for representing and working with dense (and sparse) matrices, e.g. from integral or partial differential equations. They allow the complete matrix algebra, e.g. matrix-vector multiplication, matrix addition, multiplication, inversion and factorisation in almost linear time with respect to the number of rows and columns.

HLIBpro contains various algorithms for the approximation of dense matrices, e.g. ACA and HCA, the complete set of available H-algebra, various clustering techniques, e.g. geometric and algebraic clustering, many functions for discretising integral equations, e.g. Laplace, Helmholtz and Maxwell equations. A special focus of HLIBpro lies in the parallelisation of these methods to shared (threads) and distributed memory machines (MPI).


HoloViews is a Python library that makes analyzing and visualizing scientific or engineering data much simpler, more intuitive, and more reproducible. To use HoloViews, you first wrap your data in a HoloViews component along with optional metadata describing it. It will then display itself automatically on its own or in combination with any other HoloViews component. The separate matplotlib library does the plotting, but none of the data structures depend on the plotting code, so that you can easily create, save, load, and manipulate HoloViews objects from within your own programs. HoloViews objects support arbitrary combination, selection, slicing, sorting, sampling, and animation, to allow you to focus on whatever aspect of your data you wish. Instead of writing or maintaining complex plotting code, just declare what data you want to see, and HoloViews will handle the rest.


Higher Order (Symplectic) Methods in Python. Explicit algorithms for higher order symplectic integration of a large class of Hamilton’s equations have recently been discussed by Mushtaq et al. Here we present a Python program for automatic numerical implementation of these algorithms for a given Hamiltonian, both for double precision and multiprecision computations. We provide examples of how to use this program, and illustrate the behavior of both the code generator and the generated solver module(s).
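
For readers unfamiliar with symplectic integration, the sketch below shows the standard second-order leapfrog (Störmer-Verlet) scheme in Python. It is not the higher order method of Mushtaq et al., nor code generated by this package; it assumes a separable Hamiltonian H = p²/2 + V(q) with unit mass, and the function names are illustrative.

```python
from typing import Callable, Tuple


def leapfrog(q: float, p: float, grad_potential: Callable[[float], float],
             dt: float, steps: int) -> Tuple[float, float]:
    """Second-order symplectic integration of H = p**2 / 2 + V(q)."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_potential(q)  # half kick
        q += dt * p                        # full drift
        p -= 0.5 * dt * grad_potential(q)  # half kick
    return q, p


def oscillator_energy(q: float, p: float) -> float:
    """Energy of the harmonic oscillator with V(q) = q**2 / 2."""
    return 0.5 * p * p + 0.5 * q * q


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_potential=lambda q: q, dt=0.05, steps=10_000)
drift = abs(oscillator_energy(q1, p1) - oscillator_energy(q0, p0))
```

Even after 10,000 steps the energy error stays bounded instead of growing steadily, which is the practical appeal of symplectic schemes; higher order variants shrink that bounded error further for the same step size.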


A general-purpose particle simulation toolkit. It scales from a single CPU core to thousands of GPUs. You define particle initial conditions and interactions in a high-level Python script. Then tell HOOMD-blue how you want to execute the job, and it takes care of the rest. Python job scripts give you unlimited flexibility to create custom initialization routines, control simulation parameters, and perform in situ analysis.


HOP is a multi-tier programming language for the Web 2.0 and the so-called diffuse Web. It is designed for programming interactive web applications in many fields such as multimedia (web galleries, music players, …​), ubiquitous and house automation (SmartPhones, personal appliance), mashups, office (web agendas, mail clients, …​), etc.

HOP features include:

  • an extensive set of widgets for programming fancy and portable Web GUIs,

  • full compatibility with traditional Web technologies (JavaScript, HTML, CSS),

  • HTML5 support,

  • a versatile Web server supporting HTTP/1.0 and HTTP/1.1,

  • native multimedia support for enabling ubiquitous Web multimedia applications,

  • fast WebDAV level 1 support,

  • an optimizing native code compiler for server code,

  • an on-the-fly JavaScript compiler for client code,

  • an extensive set of libraries for the mail, calendars, databases, Telephony


A Python Just-In-Time compiler for astrophysical computations. In order to combine the ease of Python and the speed of C++, we developed HOPE, a specialised Python just-in-time (JIT) compiler designed for numerical astrophysical applications. HOPE focuses on a subset of the language and is able to translate Python code into C++ while performing numerical optimisation on mathematical expressions at runtime. To enable the JIT compilation, the user only needs to add a decorator to the function definition. We assess the performance of HOPE by performing a series of benchmarks and compare its execution speed with that of plain Python, C++ and the other existing frameworks. We find that HOPE improves the performance compared to plain Python by a factor of 2 to 120, achieves speeds comparable to that of C++, and often exceeds the speed of the existing solutions.


HPC-GAP is the EPSRC funded project to reengineer the software for computation in algebra and discrete mathematics to take advantage of the power of current and future high-performance computers. Our main focus is on the GAP system and the more recent SymGridPar middleware, which provide flexible and effective computation on single processors and small clusters. We will adapt the software to efficiently use large clusters of multi-core processors to perform larger computations. To demonstrate the effectiveness of our adaptations we will apply our new software to problems from a number of important areas of pure mathematics.


The High Performance Geostatistics Library is written in C++/Python and implements a number of geostatistical algorithms. The algorithms are called from Python by executing the corresponding commands.


HPGMG implements full multigrid (FMG) algorithms using finite-volume and finite-element methods. Different algorithmic variants adjust the arithmetic intensity and architectural properties that are tested. These FMG methods converge up to discretization error in one F-cycle, thus may be considered direct solvers. An F-cycle visits the finest level a total of two times, the first coarsening (8x smaller) 4 times, the second coarsening 6 times, etc.

HPGMG-FV solves constant- and variable-coefficient elliptic problems on isotropic Cartesian grids using Full Multigrid (FMG). The method is second-order accurate in the max norm, as demonstrated by the FMG convergence. FMG interpolation (prolongation) is linear and V-cycle interpolation and restriction are piecewise constant. Recursive decomposition is used to construct a space filling curve akin to Z-Mort in order to distribute work among processes. Chebyshev polynomials are used for smoothing, preconditioned by the diagonal. FMG convergence is observed with a fourth order Chebyshev polynomial using a V(4,4) cycle. Thus convergence is reached in a total of 9 fine-grid operator applications (4 presmooths, residual, 4 postsmooths). This makes HPGMG-FV extremely fast and energy efficient.
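
The space filling curve mentioned above is a Z-order (Morton-style) curve. As a hedged illustration, the Python sketch below computes a plain 3D Morton index by bit interleaving and uses it to split boxes among processes. HPGMG builds its curve by recursive decomposition, so this is a simplified stand-in; the function names and the equal-sized chunking are assumptions.

```python
from itertools import product
from typing import List, Tuple

Box = Tuple[int, int, int]


def morton_encode_3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into one Z-order index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_decode_3d(code: int, bits: int = 10) -> Box:
    """Inverse of morton_encode_3d."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z


def partition(boxes: List[Box], num_ranks: int) -> List[List[Box]]:
    """Sort boxes along the curve and cut it into contiguous chunks."""
    ordered = sorted(boxes, key=lambda b: morton_encode_3d(*b))
    chunk = -(-len(ordered) // num_ranks)  # ceiling division
    return [ordered[i:i + chunk] for i in range(0, len(ordered), chunk)]


boxes = list(product(range(2), repeat=3))  # a 2 x 2 x 2 grid of boxes
parts = partition(boxes, num_ranks=4)
```

Because neighbouring indices on the curve tend to be neighbours in space, each process receives a spatially compact set of boxes.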


HPX (High Performance ParalleX) is a general purpose C++ runtime system for parallel and distributed applications of any scale. It strives to provide a unified programming model which transparently utilizes the available resources to achieve unprecedented levels of scalability. This library strictly adheres to the C++11 Standard and leverages the Boost C++ Libraries which makes HPX easy to use, highly optimized, and very portable. HPX is developed for conventional architectures including Linux-based systems, Windows, Mac, and the BlueGene/Q, as well as accelerators such as the Xeon Phi.

The goal of HPX is to create a high quality, freely available, open source implementation of the ParalleX model for conventional systems, such as classic Linux based Beowulf clusters or multi-socket highly parallel SMP nodes. At the same time, we want to have a very modular and well designed runtime system architecture which would allow us to port our implementation onto new computer system architectures. We want to use real world applications to drive the development of the runtime system, coining out required functionalities and converging onto a stable API which will provide a smooth migration path for developers. The API exposed by HPX is modelled after the interfaces defined by the C++11/14 ISO standard and adheres to the programming guidelines used by the Boost collection of C++ libraries.


HPXPI is an implementation of the XPI specification on top of the HPX runtime system. It is currently based on the XPI document version r313.

XPI (eXtreme ParalleX Interface) is a programming interface for parallel applications and systems based on the ParalleX execution model. XPI provides a simple abstraction layer over the HPX family of ParalleX runtime system implementations. As HPX evolves, XPI insulates application codes from such changes, ensuring stability of experimental application codes. XPI serves both as a target for source-to-source compilers of high-level languages and as a readable low-level programming interface syntax. XPI is experimental and supports current ongoing sponsored research projects. Its long-term future depends entirely on its resulting value, which is unknown at this time, but it is motivated by a short-term need to advance key project goals.


Bindings for HPX for various scripting languages, currently Python and Lua.


HSL (formerly the Harwell Subroutine Library) is a collection of state-of-the-art packages for large-scale scientific computation written and developed by the Numerical Analysis Group at the STFC Rutherford Appleton Laboratory and other experts. HSL offers users a high standard of reliability and has an international reputation as a source of robust and efficient numerical software. Among its best known packages are those for the solution of sparse linear systems of equations and sparse eigenvalue problems. MATLAB interfaces are offered for selected packages.

The Library was started in 1963 and was originally used at the Harwell Laboratory on IBM mainframes running under OS and MVS. Over the years, the Library has evolved and has been extensively used on a wide range of computers, from supercomputers to modern PCs. Recent additions include optimised support for multicore processors.

HSL packages are available at no cost for academic research and teaching. See download links for individual packages in the catalogue.


Bring the best of JavaScript data visualization to R. Use JavaScript visualization libraries at the R console, just like plots. Embed widgets in R Markdown documents and Shiny web applications. Develop new widgets using a framework that seamlessly bridges R and JavaScript.

HTTP servers


Other web servers were designed for the Web, but Caddy was designed for humans, with today’s Web in mind. Caddy supports HTTP/2, IPv6, Markdown, WebSockets, FastCGI, templates and more, right out of the box.


Seastar is an advanced, open-source C++ framework for high-performance server applications on modern hardware. Applications using Seastar can run on Linux or OSv.


Hugo is a static site generator written in Go. It is optimized for speed, ease of use and configurability. Hugo takes a directory with content and templates and renders them into a full HTML website.

Hugo relies on Markdown files with front matter for metadata. You can run Hugo from any directory, which works well for shared hosts and other systems where you don’t have a privileged account.
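As a rough illustration of what "Markdown files with front matter" means, the sketch below splits a file into its metadata header and its Markdown body. It assumes YAML-style front matter delimited by `---` lines and handles only flat `key: value` pairs; the function name `split_front_matter` is invented for this example and is not part of Hugo.

```python
def split_front_matter(text):
    """Return (metadata dict, body) for text that may start with '---' front matter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter present
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # opening delimiter was never closed
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# Hypothetical page content, used only for illustration.
page = """---
title: "Hello"
date: 2015-01-01
---

# Hello

Body text.
"""

meta, body = split_front_matter(page)
print(meta)
print(body)
```

A real site generator would use a full YAML or TOML parser here; the hand-rolled parsing only keeps the sketch free of dependencies.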


The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across operating systems, versions, architectures, etc.) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications gather information about modern computing hardware so as to exploit it accordingly and efficiently.


An open source, distributed, consistent key-value datastore. It differentiates itself from other distributed key-value datastores by claiming to offer consistency and multi-dimensional hashing on top of the usual performance, availability, and throughput guarantees. A performance test that measures HyperDex using a setup identical to that of an independent study evaluating Cassandra, MongoDB, and HBase side by side shows HyperDex to have superior throughput and latency. Multi-dimensional hashing is achieved through a mechanism called hyperspace hashing, which differs from BigTable’s multiple-column approach. The consistency guarantee is achieved through a novel chaining protocol.
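The general idea behind multi-dimensional hashing can be sketched as follows: each searchable attribute becomes one axis of a grid, an object's position is the tuple of its hashed attribute values, and a query that fixes only some attributes needs to visit only the matching slice of the grid. This is an illustration of the concept only, not HyperDex's actual implementation; the bucket count, function names, and sample record are all assumptions.

```python
import hashlib
from itertools import product

BUCKETS_PER_DIM = 4  # assumed grid resolution per attribute


def bucket(value):
    """Hash one attribute value onto one axis of the grid."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return digest[0] % BUCKETS_PER_DIM


def coordinate(obj, dims):
    """Grid cell that stores the object: one bucket per attribute."""
    return tuple(bucket(obj[d]) for d in dims)


def candidate_regions(query, dims):
    """Grid cells that could hold objects matching a partial query."""
    axes = [
        [bucket(query[d])] if d in query else range(BUCKETS_PER_DIM)
        for d in dims
    ]
    return set(product(*axes))


dims = ("first", "last", "phone")
record = {"first": "Ada", "last": "Lovelace", "phone": "555-0100"}

home = coordinate(record, dims)
by_last_name = candidate_regions({"last": "Lovelace"}, dims)
print(home in by_last_name, len(by_last_name), "of", BUCKETS_PER_DIM ** len(dims))
```

Fixing one of three attributes narrows the search from 64 cells to 16, and fixing all three narrows it to exactly one.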

HyperDex provides one of the richest APIs in the NoSQL space. Its datatype support is unparalleled by other sharded data stores. It supports bulk asynchronous operations, and it boasts the fastest consistent online backups in the industry. All of these are accessible with bindings from C, C++, Python, Ruby, Java, Go, and Rust.


HSQLDB (HyperSQL DataBase) is the leading SQL relational database software written in Java. It offers a small, fast multithreaded and transactional database engine with in-memory and disk-based tables and supports embedded and server modes. It includes a powerful command line SQL tool and simple GUI query tools.


IBEX is a C++ library for constraint processing over real numbers. It provides reliable algorithms for handling non-linear constraints. In particular, roundoff errors are also taken into account. It is based on interval arithmetic and affine arithmetic. The main feature of Ibex is its ability to build strategies declaratively through the contractor programming paradigm. It can also be used as a black-box solver.

It can be used to solve a variety of problems that can roughly be formulated as finding a reliable characterization, with boxes (Cartesian products of intervals), of sets implicitly defined by constraints. Reliable means that all sources of uncertainty should be taken into account, including:

  • approximation of real numbers by floating-point numbers

  • round-off errors

  • linearization truncation errors

  • model parameter uncertainty

  • measurement noise
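To make the first two items concrete, here is a minimal interval arithmetic sketch. It is not IBEX code (IBEX is a C++ library); the `Interval` class is invented for illustration. It pushes every computed bound one floating-point step outward with `math.nextafter` (Python 3.9 or later), a simple and conservative stand-in for directed rounding, so that the true real-number result always stays inside the interval.

```python
import math


class Interval:
    """Closed interval [lo, hi] whose arithmetic results are rounded outward."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @classmethod
    def around(cls, x):
        """Smallest widened interval guaranteed to contain the real number meant by x."""
        return cls(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

    def __add__(self, other):
        return Interval(
            math.nextafter(self.lo + other.lo, -math.inf),
            math.nextafter(self.hi + other.hi, math.inf),
        )

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(
            math.nextafter(min(products), -math.inf),
            math.nextafter(max(products), math.inf),
        )

    def __contains__(self, x):
        return self.lo <= x <= self.hi

    def __repr__(self):
        return f"[{self.lo!r}, {self.hi!r}]"


# 0.1 is not exactly representable, so enclose it and add it up ten times.
tenth = Interval.around(0.1)
total = tenth
for _ in range(9):
    total = total + tenth

print(total)         # a narrow interval around 1
print(1.0 in total)  # the exact answer is enclosed
```

The interval grows slightly with every operation, which is the price of a guaranteed enclosure; contractors, as described above, are what keep such boxes tight in practice.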


ICALAB for Signal Processing and ICALAB for Image Processing are two independent demo packages for MATLAB that implement a number of efficient algorithms for ICA (independent component analysis) employing HOS (higher order statistics), BSS (blind source separation) employing SOS (second order statistics) and LP (linear prediction), and BSE (blind signal extraction) employing various SOS and HOS methods.


A Java-based software framework for analyzing and visualizing geoscience data. It uses the VisAD and NetCDF Java libraries and other Java-based utility packages.

The IDV application is a geoscience display and analysis software system with many of the standard data displays that other Unidata software (e.g. GEMPAK and McIDAS) provide. It brings together the ability to display and work with satellite imagery, gridded data (for example, numerical weather prediction model output), surface observations, balloon soundings, NWS WSR-88D Level II and Level III RADAR data, and NOAA National Profiler Network data, all within a unified interface. It also provides 3-D views of the earth system and allows users to interactively slice, dice, and probe the data, creating cross-sections, profiles, animations and value read-outs of multi-dimensional data sets. The IDV can display any Earth-located data if it is provided in a known format.


Image processing and analysis in Java. ImageJ can display, edit, analyze, process, save and print 8-bit, 16-bit and 32-bit images. It can read many image formats including TIFF, GIF, JPEG, BMP, DICOM, FITS and "raw". It supports "stacks", a series of images that share a single window. It is multithreaded, so time-consuming operations such as image file reading can be performed in parallel with other operations.

It can calculate area and pixel value statistics of user-defined selections. It can measure distances and angles. It can create density histograms and line profile plots. It supports standard image processing functions such as contrast manipulation, sharpening, smoothing, edge detection and median filtering.

It does geometric transformations such as scaling, rotation and flips. Images can be zoomed up to 32:1 and down to 1:32. All analysis and processing functions are available at any magnification factor. The program supports any number of windows (images) simultaneously, limited only by available memory.


An ImageJ library to detect and analyze connected components (blobs) in binary images.

IJBlob: An ImageJ Library for Connected Component Analysis and Shape Analysis (online paper) -
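For readers unfamiliar with connected component analysis, the following sketch labels the blobs of a small binary image with a breadth-first flood fill and reports each blob's area. It is a generic textbook approach using 4-connectivity, not the algorithm IJBlob itself uses, and the function name and sample image are assumptions.

```python
from collections import deque


def label_blobs(image):
    """Label 4-connected foreground regions. Returns (label grid, blob count)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                count += 1  # found an unlabelled foreground pixel: new blob
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


binary = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]

labels, n = label_blobs(binary)
areas = {k: sum(row.count(k) for row in labels) for k in range(1, n + 1)}
print(n, areas)
```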


An open-source, distributed time series, metrics, and analytics database. It’s written in Go and has no external dependencies, which means that once you install it there’s nothing else to manage (such as Redis, ZooKeeper, or HBase). InfluxDB is targeted at use cases for DevOps, metrics, sensor data, and real-time analytics.


Instrumentino is an open-source modular graphical user interface framework for controlling Arduino-based experimental instruments. It expands the control capability of Arduino by allowing instrument builders to easily create a custom user interface program running on an attached personal computer. It enables the definition of operation sequences and their automated running without user intervention.

Acquired experimental data and a usage log are automatically saved on the computer for further processing.

Complex devices, which are difficult to control using an Arduino, may be integrated as well by incorporating third party application programming interfaces (APIs) into the Instrumentino framework.

Interactive Spaces

Interactive Spaces is a software platform which allows you to merge the virtual world with the physical world. By making it easy to connect sensors to applications running on different machines in a space, quite complex behaviors can be built.

Interactive Spaces applications are built from units called Activities, which can easily communicate with each other no matter where they are on the local network. Through the Interactive Spaces communication system, called a route, any activity in the space can send messages to or listen for messages from any other activity it chooses. This means you can easily control and synchronize events across a collection of machines.


The Internet-in-a-Box is a small, inexpensive device which provides essential Internet resources without any Internet connection. It provides a local copy of half a terabyte of the world’s Free information. It provides:

  • Wikipedia: Complete Wikipedia in a dozen different languages

  • Maps: Zoomable world-wide maps down to street level

  • E-books: Over 35 thousand e-books in a variety of languages

  • Software: Huge library of Open Source Software, including installable Ubuntu Linux OS with all software package repositories. Includes full source code for study or modification.

  • Video: Hundreds of hours of instructional videos

  • Chat: Simple instant messaging across the community

While the complete dataset for the Internet-in-a-Box project is over 700 GB, you can install the 500 MB QuickStart Sampler dataset to try the software without needing the full dataset.

There are several methods to install and run Internet-in-a-Box (IIAB). For installation: you can install IIAB as a Python package using the Python package manager pip. Or you can install from source using git. To run: you can run IIAB as a stand-alone server, or you can integrate it with your Apache installation using WSGI as a gateway.


Trendy stuff about wee hardware running this sort of software.


A collaborative open-source software framework that makes it easy for devices and apps to discover and communicate with each other. It supports many language bindings and can be easily integrated into platforms small and large. The AllJoyn framework defines a common way for devices and apps to communicate with one another, ushering in a new wave of interoperable devices to make the Internet of Things a reality.

The AllJoyn framework handles the complexities of discovering nearby devices, creating sessions between devices, and communicating securely between those devices. It abstracts out the details of the physical transports and provides a simple-to-use API. Multiple connection session topologies are supported, including point-to-point and group sessions. The security framework is flexible, supporting many mechanisms and trust models. And the types of data transferred are also flexible, supporting raw sockets or abstracted objects with well-defined interfaces, methods, properties, and signals.

One of the defining traits of the AllJoyn framework is its inherent flexibility. It was designed to run on multiple platforms, ranging from small embedded RTOS platforms to full-featured OSes. It supports multiple language bindings and transports. And since the AllJoyn framework is open-source, this flexibility can be extended further in the future to support even more transports, bindings, and features.


ActiveMQ Apollo is a faster, more reliable, easier to maintain messaging broker built from the foundations of the original ActiveMQ. It accomplishes this using a radically different threading and message dispatching architecture. Like ActiveMQ, Apollo is a multi-protocol broker and supports STOMP, AMQP, MQTT, Openwire, SSL, and WebSockets.


The Constrained Application Protocol (CoAP) is a specialized web transfer protocol for use with constrained nodes and constrained networks in the Internet of Things. The protocol is designed for machine-to-machine (M2M) applications such as smart energy and building automation.

See also Erbium and txThings.


Californium (Cf) is an open source implementation of the Constrained Application Protocol (CoAP). It is written in Java and targets unconstrained environments such as back-end service infrastructures (e.g., proxies, resource directories, or cloud services) and less constrained environments such as embedded devices running Linux (e.g., smart home/factory controllers or cellular gateways). Californium (Cf) has been running code for the IETF standardization of CoAP and was recently reimplemented from scratch, drawing on all the experience gained. In particular, Cf now focuses on service scalability for large-scale Internet of Things applications. The new implementation was successfully tested at the ETSI CoAP and OMA LWM2M Plugtests in November 2013 and March 2014. It complies with all mandatory and optional test cases.


This implements a lightweight application protocol for devices that are constrained in their resources, such as computing power, RF range, memory, bandwidth, or network packet sizes. This protocol, CoAP, was standardized in the IETF as RFC 7252.
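Part of what makes CoAP suitable for small packets is its fixed 4-byte header: a 2-bit version, a 2-bit message type, a 4-bit token length, an 8-bit code, and a 16-bit message ID, as laid out in RFC 7252. The sketch below encodes and decodes just that header plus the token; it ignores options and payload, and the helper names are invented for this example rather than taken from any CoAP library.

```python
import struct

COAP_VERSION = 1
TYPE_CON, TYPE_NON, TYPE_ACK, TYPE_RST = 0, 1, 2, 3
CODE_GET = 0x01      # request code 0.01
CODE_CONTENT = 0x45  # response code 2.05


def encode_header(msg_type, code, message_id, token=b""):
    """Pack the fixed CoAP header followed by the token (0 to 8 bytes)."""
    if len(token) > 8:
        raise ValueError("CoAP tokens are at most 8 bytes long")
    first = (COAP_VERSION << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first, code, message_id) + token


def decode_header(data):
    """Unpack the fixed header and token from the start of a CoAP message."""
    first, code, message_id = struct.unpack("!BBH", data[:4])
    token_length = first & 0x0F
    return {
        "version": first >> 6,
        "type": (first >> 4) & 0x03,
        "code": f"{code >> 5}.{code & 0x1F:02d}",  # class.detail notation
        "message_id": message_id,
        "token": data[4:4 + token_length],
    }


request = encode_header(TYPE_CON, CODE_GET, 0x1234, token=b"\xab")
print(request.hex())
print(decode_header(request))
```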


GSN is a Java environment that runs on one or more computers composing the backbone of the acquisition network. A set of wrappers feeds live data into the system. The data streams are then processed according to XML specification files. The system is built upon a concept of sensors (real sensors or virtual sensors, that is, new data sources created from live data) that are connected together in order to build the required processing path. For example, one can imagine an anemometer that sends its data into GSN through a wrapper (various wrappers are already available and writing new ones is quick); that data stream could then be sent to an averaging mote, and the output of this mote could be split and sent both to a database for recording and to a web site for displaying the average measured wind in real time. All of this could be done by editing only a few XML files in order to connect the various motes together.

GSN is designed to make sensor network application development a pleasure. Applications based on GSN are hardware-independent, making sensor network changes invisible to the application. For instance, you can change the underlying sensor network from Mica2 nodes to BTNodes (with compatible sensing boards) without ever touching a single line of code in the application.
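The shape of the anemometer example can be sketched with Python generators. In GSN the wiring is declared in XML files and executed by the Java runtime, so the code below only mirrors the data flow (wrapper, averaging stage, split to two sinks); every name in it is hypothetical.

```python
from collections import deque


def anemometer_wrapper(readings):
    """Stand-in for a wrapper: turns raw readings into a stream of wind speeds (m/s)."""
    for speed in readings:
        yield float(speed)


def moving_average(stream, window):
    """Virtual sensor: emits the mean of the last `window` values of its input."""
    recent = deque(maxlen=window)
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


def fan_out(stream, *sinks):
    """Split one stream so that every sink receives every value."""
    for value in stream:
        for sink in sinks:
            sink(value)


database = []     # stands in for the recording database
web_display = []  # stands in for the live web page

fan_out(
    moving_average(anemometer_wrapper([2, 4, 6, 8]), window=2),
    database.append,
    web_display.append,
)
print(database)
```

Swapping the sensor only means swapping the first stage; the averaging and output stages never see where the numbers came from, which is the hardware independence the description refers to.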

Now you have all the common sensor network requirements in one package plus the support for dozens of well known sensing hardware.


IoTivity is an open source software framework enabling seamless device-to-device connectivity to address the emerging needs of the Internet of Things.


An integration middleware for the Internet of Things. It provides a communication stack for embedded devices based on IPv6, Web services and oBIX to provide interoperable interfaces for smart objects. By combining 6LoWPAN for constrained wireless networks with the Constrained Application Protocol and Efficient XML Interchange, it provides an efficient stack that allows interoperable Web technologies to be used in sensor and actuator networks and systems while remaining nearly as efficient in transmitted message size as existing automation systems. The IoTSyS middleware aims to provide a gateway concept for the sensor and actuator systems found in today’s home and building automation systems, as well as a stack that can be deployed directly on embedded 6LoWPAN devices, and it further addresses security, discovery and scalability issues.


Kura is a Java/OSGi-based framework for IoT gateways. Kura APIs offer access to the underlying hardware (serial ports, GPS, watchdog, GPIOs, I2C, etc.), management of network configurations, communication with M2M/IoT Integration Platforms, and gateway management.


A project to improve the ease of wireless sensor network programming by allowing programmers to express high-level objectives, and leave the low-level details to the compiler and run-time system. The goal is also to enable easy integration with other systems, such as business systems, mainly via the usage of business process modeling.

A makeSense tutorial is available. The tutorial comes as a virtual machine (so the file to download is quite large) that has all the software you need to learn how to develop sensor network applications using makeSense. The tutorial has two parts. In the first part, domain experts can use our model editor (with extended BPMN) to develop sensor network applications. In the second part, programmers (including those with limited or no knowledge of sensor networks) can learn how to use the makeSense macroprogramming language to develop sensor network applications.


The Mihini project delivers an embedded runtime running on top of Linux, that exposes a high-level Lua API for building Machine-to-Machine applications.


MQTT stands for MQ Telemetry Transport. It is an extremely simple and lightweight publish/subscribe messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks. The design principles are to minimise network bandwidth and device resource requirements whilst also attempting to ensure reliability and some degree of assurance of delivery. These principles also turn out to make the protocol ideal for the emerging “machine-to-machine” (M2M) or “Internet of Things” world of connected devices, and for mobile applications where bandwidth and battery power are at a premium.
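The publish/subscribe model can be illustrated with a toy in-memory broker. The sketch below implements MQTT-style topic filters, where `+` matches exactly one topic level and `#` matches all remaining levels, and delivers each published message to every matching subscriber. It involves no networking, quality-of-service levels, or retained messages, it skips the special handling of topics beginning with `$`, and `ToyBroker` and `topic_matches` are names made up for this example.

```python
def topic_matches(pattern, topic):
    """Return True if an MQTT-style topic filter matches a concrete topic."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(pattern_levels):
        if level == "#":
            return True  # multi-level wildcard: matches everything from here on
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(pattern_levels) == len(topic_levels)


class ToyBroker:
    """Keeps (filter, callback) pairs and routes published messages to them."""

    def __init__(self):
        self.subscriptions = []

    def subscribe(self, pattern, callback):
        self.subscriptions.append((pattern, callback))

    def publish(self, topic, payload):
        for pattern, callback in self.subscriptions:
            if topic_matches(pattern, topic):
                callback(topic, payload)


broker = ToyBroker()
received = []
broker.subscribe("home/+/temperature", lambda t, p: received.append((t, p)))
broker.subscribe("home/#", lambda t, p: received.append(("all", p)))

broker.publish("home/kitchen/temperature", "21.5")
broker.publish("home/kitchen/humidity", "40")
broker.publish("garden/temperature", "12.0")
print(received)
```

Publishers and subscribers never reference each other, only topic names, which is what lets small devices come and go without reconfiguring the rest of the system.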


The Mosquitto project provides an open-source implementation of an MQTT broker. It implements the MQ Telemetry Transport protocol versions 3.1 and 3.1.1. MQTT provides a lightweight method of carrying out messaging using a publish/subscribe model. This makes it suitable for "machine to machine" messaging such as with low power sensors or mobile devices such as phones, embedded computers or microcontrollers like the Arduino.

The Mosquitto broker is the focus of the project and aims to be a lightweight and functional MQTT broker that can run on relatively constrained systems, but still be powerful enough for a wide range of applications. The mosquitto_pub and mosquitto_sub command line utilities provide a straightforward and powerful way of interacting with your broker. The client library that the utilities use for their MQTT support can be used to develop your own MQTT applications.


A simple web interface for MQTT that can subscribe to an MQTT topic and display the information.


An open source utility intended to help you monitor activity on MQTT topics. It has been designed to deal with high volumes of messages as well as occasional publications. It is a JavaFX application that should work on any operating system with an appropriate version of Java 8 installed. mqtt-spy-daemon is a Java-based command line tool that does not require a GUI environment. Its basic functionality works with Java 7, whereas some of the advanced features like scripting require Java 8 to be installed.


The Paho project provides open-source client implementations of open and standard messaging protocols aimed at new, existing, and emerging applications for Machine‑to‑Machine (M2M) and Internet of Things (IoT).


Node-RED is a tool for wiring together hardware devices, APIs and online services in new and interesting ways. Node-RED provides a browser-based flow editor that makes it easy to wire together flows using the wide range of nodes in the palette. Flows can then be deployed to the runtime in a single click.


The Internet of Things (IoT) is set to occupy a substantial component of the future Internet. The IoT connects sensors and devices that record physical observations to applications and services of the Internet. As a successor to technologies such as RFID and Wireless Sensor Networks (WSN), the IoT has stumbled into vertical silos of proprietary systems, providing little or no interoperability with similar systems. As the IoT represents the future state of the Internet, an intelligent and scalable architecture is required to provide connectivity between these silos, enabling discovery of physical sensors and interpretation of messages between things. This paper proposes a gateway and Semantic Web enabled IoT architecture to provide interoperability between systems using established communication and data standards. The Semantic Gateway as Service (SGS) allows translation between messaging protocols such as XMPP, CoAP and MQTT via a multi-protocol proxy architecture. Utilization of broadly accepted specifications such as W3C’s Semantic Sensor Network (SSN) ontology for semantic annotations of sensor data provides semantic interoperability between messages and supports semantic reasoning to obtain higher-level actionable knowledge from low-level sensor data.


Software for integrating different home automation systems and technologies into one single solution that allows over-arching automation rules and offers uniform user interfaces. The open Home Automation Bus (openHAB) project aims at providing a universal integration platform for all things around home automation. It is a pure Java solution, fully based on OSGi. The Equinox OSGi runtime and Jetty as a web server form the core foundation of the runtime.

It is designed to be absolutely vendor-neutral as well as hardware/protocol-agnostic. openHAB brings together different bus systems, hardware devices and interface protocols by dedicated bindings. These bindings send and receive commands and status updates on the openHAB event bus. This concept allows designing user interfaces with a unique look and feel, but with the possibility to operate devices based on a large number of different technologies. Besides the user interfaces, it also brings the power of automation logic across different system boundaries.


OpenRemote is a software integration platform for residential and commercial building automation. The OpenRemote platform is automation-protocol agnostic, operates on off-the-shelf hardware and is freely available under an Open Source license. OpenRemote’s architecture enables fully autonomous and user-independent intelligent buildings. End-user control interfaces are available for iOS and Android devices, and for devices with modern web browsers. User interface design, installation management and configuration can be handled remotely with OpenRemote cloud-based design tools.

The supported protocols/devices include TCP/IP, Telnet, HTTP/REST, RS-232, AMX, KNX, Lutron, Z-Wave, 1-Wire, EnOcean, xPL, Insteon, X10, infrared, Russound, GlobalCache, IRTrans, VLC, panStamp, Denon AVR, Freebox, MythTV and more.


The OSGi specification describes a modular system and a service platform for the Java programming language that implements a complete and dynamic component model, something that does not exist in standalone Java/VM environments. Applications or components, coming in the form of bundles for deployment, can be remotely installed, started, stopped, updated, and uninstalled without requiring a reboot; management of Java packages/classes is specified in great detail. Application life cycle management is implemented via APIs that allow for remote downloading of management policies. The service registry allows bundles to detect the addition of new services, or the removal of services, and adapt accordingly.
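The service registry behaviour described in the last sentence can be mimicked in a few lines. The real OSGi API is Java and considerably richer (service properties, filters, reference counting), so the Python sketch below only illustrates the idea of listeners being told when services appear and disappear; the class, method, and event names are all invented for the example.

```python
class ServiceRegistry:
    """Toy registry: named services plus listeners notified of changes."""

    def __init__(self):
        self._services = {}
        self._listeners = []

    def add_listener(self, listener):
        """listener(event, name) is called on every registration change."""
        self._listeners.append(listener)

    def register(self, name, service):
        self._services[name] = service
        self._notify("registered", name)

    def unregister(self, name):
        if self._services.pop(name, None) is not None:
            self._notify("unregistered", name)

    def get(self, name):
        return self._services.get(name)

    def _notify(self, event, name):
        for listener in list(self._listeners):
            listener(event, name)


class Greeter:
    """A component that adapts to whether a 'log' service is present."""

    def __init__(self, registry):
        self.registry = registry
        self.output = []

    def greet(self, who):
        log = self.registry.get("log")
        message = f"hello {who}"
        self.output.append(log(message) if log else message)


registry = ServiceRegistry()
events = []
registry.add_listener(lambda event, name: events.append((event, name)))

greeter = Greeter(registry)
greeter.greet("a")                                    # no log service yet
registry.register("log", lambda msg: f"[log] {msg}")
greeter.greet("b")                                    # uses the new service
registry.unregister("log")
greeter.greet("c")                                    # falls back again
print(events, greeter.output)
```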

The OSGi specifications have evolved beyond the original focus of service gateways, and are now used in applications ranging from mobile phones to the open-source Eclipse IDE. Other application areas include automobiles, industrial automation, building automation, PDAs, grid computing, entertainment, fleet management and application servers.


Remote Services for OSGi runs as an OSGi bundle and facilitates distribution for arbitrary OSGi framework implementations. All that a service provider framework has to do is register a service for remote access. Subsequently, other peers can connect to the service provider peer and get access to the service. Remote services are accessed in an entirely transparent way. For every remote service, a local proxy bundle is generated that registers the same service. Local service clients can hence access the remote service in the same way, without regard to distribution.

Even though Remote Services for OSGi is a sophisticated middleware for OSGi frameworks, it uses a very efficient network protocol and has a small footprint. This makes it ideal for small and embedded devices with limited memory and network bandwidth. The service runs on every OSGi-compliant environment. Remote Services for OSGi has been tested with Eclipse Equinox, Knopflerfish, and Oscar / Apache Felix, as well as with our own lightweight OSGi implementation Concierge. Our test platforms include a variety of different devices, hardware architectures and Java VMs.


Our goal is to provide a programming environment where sensors can be programmed as an ensemble rather than individually. Here programmers will focus on the applications and on the services provided by the collection of sensors rather than on which particular sensor to program or on how communication will take place. Energy levels and scheduling will to some extent be controlled by the programmers, e.g., to give priority to specific application tasks. Briefly put, we will employ macro programming of networks of sensors rather than micro programming of individual sensors. We will focus on wireless sensor networks, but the general concept of ensemble programming is applicable to other multiprocessor areas such as grid computing, multicore, and server farms, like Google, etc. We will demonstrate that sensor networks can be programmed as an ensemble under severe constraints, including being devices with very limited capabilities, frequently failing sensors, mobility, energy constraints and harsh security threats.

ProFuN TG is a high-level programming environment for wireless sensor networks. It helps the application programmer with software design, deployment and maintenance. The tool is customizable: both user-defined tasks and user-defined objective functions for task mapping are possible.


Eclipse SmartHome is a framework for building smart home solutions. As such, it consists of a rich set of OSGi bundles that serve different purposes. Not all solutions that build on top of Eclipse SmartHome will require all of those bundles - instead they can choose what parts are interesting for them.


TANGO is a software toolkit for connecting things together, building control systems, and integrating systems. It is free, open source and object-oriented. It is easy to use and is well adapted to solving simple and complex distributed problems. TANGO Controls has been used to build solutions for:

  • Distributed Control Systems (DCS) in which devices are controlled and monitored in a local distributed network

  • Supervisory Control And Data Acquisition (SCADA) systems in which remote devices are controlled and monitored centrally

  • Integrated Control Systems (ICS) in which different autonomous control systems are integrated into a central one

  • Interface Devices that run on small embedded platforms into a distributed control system

  • Internet of Things (IoT) applications in which arbitrary devices are controlled through the Internet

  • Machine to Machine (M2M) applications in which devices communicate with each other

  • System Integration Platforms in which different kinds of software applications and systems are integrated into a central one

TANGO Controls is operating system independent and supports C++, Java and Python for all of the components.


Taurus is a Python framework for both CLI and GUI TANGO applications. It is built on top of PyTango and PyQt. Taurus stands for TAngo User interface ‘R’ US.

The Thing System

The Thing System is a set of software components and network protocols. Our steward software is written in node.js making it both portable and easily extensible. It can run on your laptop, or fit onto a small single board computer like the Raspberry Pi.

The steward is at the heart of the system and connects to Things in your home, whether those things are media players such as the Sonos or the Apple TV, your Nest thermostat, your INSTEON home control system, or your Philips Hue lightbulbs — whether your things are connected together via Wi-Fi, USB or Bluetooth Low Energy (BLE). The steward will find them and bring them together so they can talk to one another and perform magic.

Dozens of "things" are supported, with more on the way.


WaveScope is a system for developing distributed, high-rate applications that need to process streams of data from various sources (e.g., sensors) using a combination of signal processing and database (event stream processing) operations. The execution environment for these applications ranges from embedded sensor nodes to multicore/multiprocessor servers.

WaveScript is the programming language used to develop WaveScope applications. It is a high-level, functional, stream-processing language that aims to deliver uncompromising performance. WaveScript programs execute in parallel on multiple cores, or distributed across a network. Its compiler uses aggressive partial evaluation techniques to remove abstractions and reduce the source program to a graph of stream operators.

The WaveScript compiler supports multiple backends generating code for several languages.

  • First, the flagship WaveScript backend offering the best performance generates native code using a C compiler backend.

  • Second, an embedding of WaveScript in Scheme is included with the compiler and enables low-latency compile-link-load of new programs.

  • Third, because WaveScript is similar to ML, translation is straightforward and WaveScript can generate code for SML (MLton) or OCaml.

IoT Operating Systems


Contiki is an open source operating system for the Internet of Things. Contiki connects tiny low-cost, low-power microcontrollers to the Internet. Contiki provides powerful low-power Internet communication. Contiki supports fully standard IPv6 and IPv4, along with the recent low-power wireless standards: 6LoWPAN, RPL, and CoAP. With Contiki’s ContikiMAC and sleepy routers, even wireless routers can be battery-operated.

Contiki is designed to run on classes of hardware devices that are severely constrained in terms of memory, power, processing power, and communication bandwidth. A typical Contiki system has memory on the order of kilobytes, a power budget on the order of milliwatts, processing speed measured in megahertz, and communication bandwidth on the order of hundreds of kilobits/second. This class of systems includes both various types of embedded systems as well as a number of old 8-bit computers. Despite providing multitasking and a built-in TCP/IP stack, Contiki only needs about 10 kilobytes of RAM and 30 kilobytes of ROM. A full system, complete with a graphical user interface, needs about 30 kilobytes of RAM.


CALIPSO builds Internet Protocol (IP) connected smart object networks, but with novel methods to attain very low power consumption, thereby providing both interoperability and long lifetimes. CALIPSO leans on the significant body of work on sensor networks to integrate radio duty cycling and data-centric mechanisms into the IPv6 stack, something that existing work has not previously done. CALIPSO works at three layers: the network, the routing, and the application layer. We also revisit architectural decisions on naming, identification, and the use of middle-boxes.

CALIPSO works within the IETF/IPv6 framework, which includes the recent IETF RPL and CoAP protocols. This gives a structure for evaluation that has not previously been available. We use the Contiki open source OS, Europe’s leading smart object OS, as the target development environment for prototyping and experimental evaluation.


Erbium (Er) is a low-power REST Engine for Contiki that was developed together with SICS (and a rare earth element that is found in the Ytterby mine near Stockholm). The REST Engine includes a comprehensive embedded CoAP implementation, which became the official one for the Contiki OS. It supports RFC 7252 together with blockwise transfers and observing.


RIOT is an operating system designed for the particular requirements of Internet of Things (IoT) scenarios. These requirements comprise a low memory footprint, high energy efficiency, real-time capabilities, a modular and configurable communication stack, and support for a wide range of low-power devices. RIOT provides a microkernel, utilities like cryptographic libraries, data structures (bloom filters, hash tables, priority queues), or a shell, different network stacks, and support for various microcontrollers, radio drivers, sensors, and configurations for entire platforms, e.g. TelosB or STM32 Discovery Boards.


IPFS is a distributed file system that seeks to connect all computing devices with the same system of files. In some ways, this is similar to the original aims of the Web, but IPFS is actually more similar to a single BitTorrent swarm exchanging Git objects.

It combines good ideas from Git, BitTorrent, Kademlia, SFS, and the Web. IPFS provides an interface as simple as the HTTP web, but with permanence built in.


IPredator provides you with an encrypted tunnel from your computer to the Internet. We are hiding your real IP address behind one of ours.


IRAF is the Image Reduction and Analysis Facility, a general purpose software system for the reduction and analysis of scientific data. IRAF is written and supported by the IRAF programming group at the National Optical Astronomy Observatories (NOAO) in Tucson, Arizona. IRAF includes a good selection of programs for general image processing and graphics applications, plus a large number of programs for the reduction and analysis of optical astronomy data within the NOAO package. External or layered packages are also available for the analysis of HST, XRAY and EUV data. IRAF provides a complete programming environment, which includes the Command Language script facility, the IMFORT Fortran programming interface, and the fully featured SPP/VOS programming environment in which the portable IRAF system is written.

See PyRAF.


Iris seeks to provide a powerful, easy to use, and community-driven Python library for analysing and visualising meteorological and oceanographic data sets.


IRPF90 is a Fortran programming environment which helps the development of large Fortran codes by applying the Implicit Reference to Parameters method (IRP).

In Fortran programs, the programmer has to focus on the order of the instructions: before using a variable, the programmer has to be sure that it has already been computed in all possible situations. For large codes, this is a common source of error.

In IRPF90 most of the order of instructions is handled by the pre-processor, and an automatic mechanism guarantees that every entity is built before being used. This mechanism relies on the needs/needed by relations between the entities, which are built automatically.
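The "built before being used" idea can be illustrated outside Fortran. The following Python sketch is only an analogy and is not IRPF90 syntax or its implementation: each entity gets a provider function, and a cache ensures the entity is built once, on first use. The entity names (`mass`, `velocity`, `kinetic_energy`) and the `build_order` log are hypothetical.

```python
from functools import lru_cache

build_order = []  # records the order in which entities are actually built


@lru_cache(maxsize=None)
def mass():
    build_order.append("mass")
    return 2.0


@lru_cache(maxsize=None)
def velocity():
    build_order.append("velocity")
    return 3.0


@lru_cache(maxsize=None)
def kinetic_energy():
    # Asking for the inputs triggers their providers if needed, so the
    # caller never has to sequence the computation by hand.
    value = 0.5 * mass() * velocity() ** 2
    build_order.append("kinetic_energy")
    return value


if __name__ == "__main__":
    print(kinetic_energy())
    print(build_order)
```

Requesting `kinetic_energy()` a second time returns the cached value without rebuilding anything, which mirrors the needs/needed-by bookkeeping described above in a very reduced form.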

Codes written with IRPF90 often execute faster than Fortran programs, and are faster to write and easier to maintain.


The Integrated System for Imagers and Spectrometers (ISIS) is a free, specialized, digital image processing software package developed by the USGS for NASA. The key feature of ISIS is the ability to place many types of data in the correct cartographic location, enabling disparate data to be co-analyzed. ISIS also includes standard image processing applications such as contrast, stretch, image algebra, filters, and statistical analysis. ISIS can process two-dimensional images as well as three-dimensional cubes derived from imaging spectrometers. The production of USGS topographic maps of extraterrestrial landing sites relies on ISIS software. ISIS is able to process data from NASA and international spacecraft missions including Lunar Orbiter, Apollo, Voyager, Mariner 10, Viking, Galileo, Magellan, Clementine, Mars Global Surveyor, Cassini, Mars Odyssey, Mars Reconnaissance Orbiter, MESSENGER, Lunar Reconnaissance Orbiter, Chandrayaan, Dawn, and Kaguya.

The ISIS software is a valuable resource for planetary missions that require systematic data processing, products for planning, and research and analysis of derived data products. By using ISIS, missions can leverage millions of dollars of software development that NASA has paid for. However, before the power of ISIS can be applied to an instrument, a camera model and custom programs to ingest mission-specific ancillary data are necessary. Once an instrument is added to ISIS, it can support data processing pipelines, radiometric calibration, photometric calibration, band-to-band registration of multispectral data, ortho-rectification, construction of scientifically accurate and cosmetically pleasing mosaics, generation of control network solutions, and creation of topographic models.


A dynamic computer programming language.[5] It is most commonly used as part of web browsers, whose implementations allow client-side scripts to interact with the user, control the browser, communicate asynchronously, and alter the document content that is displayed.[5] It is also used in server-side network programming with runtime environments such as Node.js, game development and the creation of desktop and mobile applications. With the rise of the single-page web app and JavaScript-heavy sites, it is increasingly being used as a compile target for source-to-source compilers from both dynamic languages and static languages. In particular, Emscripten and highly optimised JIT compilers, in tandem with asm.js, which is friendly to AOT compilers like OdinMonkey, have enabled C and C++ programs to be compiled into JavaScript and execute at near-native speeds, leading JavaScript to be considered the "assembly language of the web",[6] according to its creator and others.

The Hitchhiker’s Guide to Modern JavaScript Tooling -

Introduction to JavaScript for Fortran programmers -


A low-level, extraordinarily optimizable subset of JavaScript. It is an intermediate programming language consisting of a strict subset of the JavaScript language. It enables significant performance improvements for web applications that are written in statically-typed languages with manual memory management (such as C) and then translated to JavaScript by a source-to-source compiler. Asm.js does not aim to improve the performance of hand-written JavaScript code, nor does it enable anything other than enhanced performance.

It is intended to have performance characteristics closer to those of native code than standard JavaScript by limiting language features to those amenable to ahead-of-time optimization and other performance improvements.[2] By using a subset of JavaScript, asm.js is already supported by all major web browsers,[3] unlike alternative approaches such as Google Native Client. Mozilla Firefox was the first web browser to implement asm.js-specific optimizations, starting with Firefox 22.[4] The optimizations of Google Chrome’s V8 JavaScript engine in Chrome 28 made asm.js benchmarks more than twice as fast as prior versions of Chrome.

See Emscripten.


Babel.js is a transpiler: it takes code written to the ECMAScript 2015 standard and produces code in the older standard that browsers can run. It also allows you to enable experimental ECMAScript 2016 (a.k.a. ECMAScript 7 or ES7) features and has a built-in JSX transpiler, which takes the JSX syntax that React uses and produces JavaScript code from it.


Carnival is an unobtrusive, developer-friendly way to add comments to any web site.


DynJS is an ECMAScript runtime for the JVM.


JerryScript is a lightweight JavaScript engine intended to run on very constrained devices such as microcontrollers.


Nashorn’s goal is to implement a lightweight high-performance JavaScript runtime in Java with a native JVM. The project intends to enable Java developers to embed JavaScript in Java applications via JSR-223 and to develop free-standing JavaScript applications using the jrunscript command-line tool.


NPM is a package manager. It does the same job as your system package manager, but for JavaScript. It is a tool for downloading all the pieces of your environment. NPM takes care of downloading packages, resolving their dependencies, and providing a package abstraction around your project. So when another developer wants to work with your codebase, all they need to do is issue the npm install command and all dependencies will be installed automatically. In such a package you can also include license info, name, keywords, version, description and other metadata about your code. If you are developing a library, npm also helps you to publish it later and make it available to all developers who work within the Node.js environment.


Rhino is an open-source implementation of JavaScript written entirely in Java. It is typically embedded into Java applications to provide scripting to end users. It is embedded in J2SE 6 as the default Java scripting engine.

JavaScript Frameworks

Things that run on top of JavaScript.


An open source JavaScript framework for cross-platform mobile, desktop, TV and web applications emphasizing object-oriented encapsulation and modularity.


The Numeric JavaScript library allows you to perform sophisticated numerical computations in pure JavaScript in the browser and elsewhere.


SpiderMonkey is Mozilla’s JavaScript engine written in C/C++. It is used in various Mozilla products, including Firefox, and is available under the MPL2.

SpiderMonkey is the code name for the first-ever JavaScript engine, written by Brendan Eich at Netscape Communications, later released as open source and now maintained by the Mozilla Foundation. SpiderMonkey provides JavaScript support for Mozilla Firefox and various embeddings such as the GNOME 3 desktop.

Eich "wrote JavaScript in ten days" in 1995, having been "recruited to Netscape with the promise of doing Scheme in the browser". (The idea of using Scheme was abandoned when "engineering management [decided] that the language must ‘look like Java’".) In the fall of 1996, Eich, needing to "pay off [the] substantial technical debt" left from the first year, "stayed home for two weeks to rewrite Mocha as the codebase that became known as SpiderMonkey". The name SpiderMonkey was chosen as a reference to the movie Beavis and Butt-head Do America, in which the character Tom Anderson mentions that the title characters were "whacking off like a couple of spider monkeys." In 2011, Eich transferred management of the SpiderMonkey code to Dave Mandelin.


The V8 JavaScript Engine is an open source JavaScript engine developed by Google for the Google Chrome web browser. V8 compiles JavaScript to native machine code (IA-32, x86-64, ARM, or MIPS ISAs) before executing it, instead of more traditional techniques such as interpreting bytecode or compiling the whole program to machine code and executing it from a filesystem. The compiled code is additionally optimized (and re-optimized) dynamically at runtime, based on heuristics of the code’s execution profile. Optimization techniques used include inlining, elision of expensive runtime properties, and inline caching, among many others.


Jekyll is a simple, blog-aware, static site generator perfect for personal, project, or organization sites. Think of it like a file-based CMS, without all the complexity. Jekyll takes your content, renders Markdown and Liquid templates, and spits out a complete, static website ready to be served by Apache, Nginx or another web server. Jekyll is the engine behind GitHub Pages, which you can use to host sites right from your GitHub repositories.


Complex systems are increasingly being viewed as distributed information processing systems, particularly in the domains of computational neuroscience, bioinformatics and Artificial Life. This trend has resulted in a strong uptake in the use of (Shannon) information-theoretic measures to analyse the dynamics of complex systems in these fields. We introduce the Java Information Dynamics Toolkit (JIDT): a Google code project which provides a standalone, (GNU GPL v3 licensed) open-source code implementation for empirical estimation of information-theoretic measures from time-series data. While the toolkit provides classic information-theoretic measures (e.g. entropy, mutual information, conditional mutual information), it ultimately focusses on implementing higher-level measures for information dynamics. That is, JIDT focusses on quantifying information storage, transfer and modification, and the dynamics of these operations in space and time. For this purpose, it includes implementations of the transfer entropy and active information storage, their multivariate extensions and local or pointwise variants. JIDT provides implementations for both discrete and continuous-valued data for each measure, including various types of estimator for continuous data (e.g. Gaussian, box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time due to Java’s object-oriented polymorphism. Furthermore, while written in Java, the toolkit can be used directly in MATLAB, GNU Octave, Python and other environments. We present the principles behind the code design, and provide several examples to guide users.
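To make the measures named above concrete, here is a minimal Python sketch of plug-in (frequency-count) estimators for discrete data. It is not JIDT code and does not use the JIDT API; the function names are hypothetical, the transfer entropy uses a history length of one, and no bias correction is attempted.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Plug-in Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def transfer_entropy(source, target):
    """Transfer entropy source -> target with history length 1, in bits.

    Computed as I(X_next ; Y_past | X_past) from joint entropies.
    """
    nxt, past, src = target[1:], target[:-1], source[:-1]
    return (entropy(list(zip(nxt, past))) + entropy(list(zip(src, past)))
            - entropy(past) - entropy(list(zip(nxt, src, past))))


if __name__ == "__main__":
    source = [0, 0, 1, 1, 0, 0, 1, 1, 0]
    target = [0] + source[:-1]  # target copies source with a one-step delay
    print(transfer_entropy(source, target))
```

In the usage example the target is a delayed copy of the source, so the source's past carries information about the target's next value beyond what the target's own past provides, and the estimate comes out positive.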


Joblib provides a simple helper class to write parallel for loops using multiprocessing. The core idea is to write the code to be executed as a generator expression, and convert it to parallel computing.
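The pattern of writing the loop body as a generator expression and handing it to a parallel runner can be mimicked with the standard library alone. The sketch below is not Joblib's API: `defer` and `run_parallel` are hypothetical stand-ins, and it uses a thread pool for brevity rather than multiprocessing.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def defer(func):
    """Wrap func so that calling it records the call instead of running it."""
    def wrapper(*args, **kwargs):
        return (func, args, kwargs)
    return wrapper


def run_parallel(tasks, workers=2):
    """Run an iterable of deferred calls on a pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(f, *a, **kw) for f, a, kw in tasks]
        return [fut.result() for fut in futures]


if __name__ == "__main__":
    # The loop body is written as a generator expression of deferred calls.
    print(run_parallel(defer(sqrt)(i ** 2) for i in range(5)))
```

Separating "describe the call" from "execute the call" is what lets the runner decide where and when each iteration is evaluated.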


Jolie is an open-source programming language for developing distributed applications based on microservices. In the programming paradigm proposed with Jolie, each program is a service that can communicate with other programs by sending and receiving messages over a network.


An application launcher for portable Java applications on every computer everywhere you go. jPort creates a Java enabled menu to launch dozens of free applications. jPort desktop does not require installation. Simply upload jPort to any desktop and hundreds of awesome applications will be at your fingertips.


In a decentralized computing environment, it’s a better practice to pass program code to various machines to execute (and then gather the results) when the application is dealing with huge amounts of data. However, how can machines of various configurations understand each other? Also, the "moving code, least moving data" policy may work better with functional programming than imperative programming.

Those questions/issues lead to the idea of doing functional programming in JSON. If programs can be coded in JSON, they can be easily shipped around and understood by machines of various settings. Combining JSON and functional programming also makes security issues easier to track or manage.

JSON-FP is part of an attempt to make data freely and easily accessed, distributed, annotated, meshed, even re-emerged with new values. To achieve that, it’s important to be able to ship codes to where data reside, and that’s what JSON-FP is trying to achieve.
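As a rough illustration of shipping a program as JSON, the Python sketch below evaluates a small pipeline encoded as JSON text. The expression format and the operator names (`add`, `mul`, `map`, `sum`) are invented for this example and are not JSON-FP's actual syntax.

```python
import json

# Hypothetical operator table; these names are illustrative only.
OPS = {
    "add": lambda x, arg: x + arg,
    "mul": lambda x, arg: x * arg,
    "map": lambda xs, arg: [evaluate(arg, x) for x in xs],
    "sum": lambda xs, arg: sum(xs),
}


def evaluate(program, value):
    """Apply a JSON-encoded program to an input value.

    A program is a list of single-key objects {"op": argument};
    the output of each step feeds the next step.
    """
    steps = program if isinstance(program, list) else [program]
    for step in steps:
        (name, arg), = step.items()
        value = OPS[name](value, arg)
    return value


if __name__ == "__main__":
    # The program is plain JSON text, so it can be sent to where the data is.
    shipped = '[{"map": [{"mul": 2}, {"add": 1}]}, {"sum": null}]'
    print(evaluate(json.loads(shipped), [1, 2, 3]))
```

Because the receiving side only runs operators found in its own table, an unknown operator fails with a `KeyError` rather than executing arbitrary code, which is one way the security concern mentioned above becomes easier to reason about.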


A high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, largely written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.

Julia Packages -


Julia package for function approximation.

ApproxFun and solution of differential equations in Julia (slides, PDF, 2014, 41) - Sheehan Olver -

A practical framework for infinite-dimensional linear algebra (paper, PDF, 2014, 5) - Sheehan Olver & Alex Townsend -

The automatic solution of partial differential equations using a global spectral method (paper, PDF, 2014, 22) - Alex Townsend & Sheehan Olver -


Compose is a declarative vector graphics system written in Julia. It’s designed to simplify the creation of complex graphics and serves as the basis of the Gadfly data visualization package.


Escher lets you build beautiful interactive Web UIs in Julia.

Escher’s built-in web server allows you to create interactive UIs with very little code. It takes care of messaging between Julia and the browser under-the-hood. It can also hot-load code: you can see your UI evolve as you save your changes to it.

The built-in library functions support Markdown, Input widgets, TeX-style Layouts, Styling, TeX, Code, Behaviors, Tabs, Menus, Slideshows, Plots (via Gadfly) and Vector Graphics (via Compose) – everything a Julia programmer would need to effectively visualize data or to create user-facing GUIs. The API comprehensively covers features from HTML and CSS, and also provides advanced features. Its user merely needs to know how to write code in Julia.


Gadfly is a system for plotting and visualization based largely on Hadley Wickham’s ggplot2 for R, and Leland Wilkinson’s book The Grammar of Graphics.


JuMP is a domain-specific modeling language for mathematical programming embedded in Julia. It currently supports a number of open-source and commercial solvers for a variety of problem classes, including linear programming, (mixed) integer programming, second-order conic programming, and nonlinear programming.

JuMP makes it easy to specify and solve optimization problems without expert knowledge, yet at the same time allows experts to implement advanced algorithmic techniques such as exploiting efficient hot-starts in linear programming or using callbacks to interact with branch-and-bound solvers. JuMP is also fast - benchmarking has shown that it can create problems at similar speeds to special-purpose commercial tools such as AMPL while maintaining the expressiveness of a generic high-level programming language. JuMP can be easily embedded in complex work flows including simulations and web servers.


On-line statistics for Julia.


The next generation of IPython notebooks. IPython will continue to exist as a Python kernel for Jupyter, but the notebook and other language-agnostic parts of IPython will move to new projects under the Jupyter name. IPython 3.0 will be the last monolithic release of IPython.

Jupyter Advanced Topics Tutorial (2:48:53 video) -

Teaching with IPython: Jupyter Notebooks and Jupyterhub (19:57 video) -

See also Flexx.


This repository contains custom Contents classes that allow IPython to use Google Drive for file management. The code is organized as a Python package that contains functions to install a Jupyter Notebook JavaScript extension, and activate/deactivate different IPython profiles to be used with Google Drive.


Multi-user server for Jupyter notebooks.


Jupyter nbviewer is the web application behind The Jupyter Notebook Viewer, which is graciously hosted by Rackspace. Run this locally to get most of the features of nbviewer on your own network.


Creates temporary Jupyter Notebook servers using Docker containers. This launches a docker container for each user that requests one. In practice, this gets used to provide temporary notebooks, demo the IPython notebook as part of a Nature article, or even provide Jupyter kernels for publications.


A Java virtual machine (JVM) is an abstract computing machine. There are three notions of the JVM: specification, implementation, and instance. The specification is a book that formally describes what is required of a JVM implementation. Having a single specification ensures all implementations are interoperable. A JVM implementation is a computer program that implements requirements of the JVM specification in a compliant and preferably performant manner. An instance of the JVM is a process that executes a computer program compiled into Java bytecode.


Kahler is a Python library that implements discrete exterior calculus on arbitrary Hermitian manifolds. Borrowing techniques and ideas first implemented in PyDEC, Kahler provides a uniquely general framework for computation using discrete exterior calculus. Manifolds can have arbitrary dimension, topology, bilinear Hermitian metrics, and embedding dimension. Kahler comes equipped with tools for generating triangular meshes in arbitrary dimensions with arbitrary topology. Kahler can also generate discrete sharp operators and implement de Rham maps. Computationally intensive tasks are automatically parallelized over the number of cores detected. The program itself is written in Cython, a superset of the Python language that is translated to C and compiled for extra speed. Kahler is applied to several example problems: normal modes of a vibrating membrane, electromagnetic resonance in a cavity, the quantum harmonic oscillator, and the Dirac-Kahler equation. Convergence is demonstrated on random meshes.


A simple and fast framework for spatial analysis in Python. It contains clean vector and raster data types that are coordinate system-aware, implementations of frequently used geospatial analysis methods, and read/write interfaces to several formats, including GeoJSON, shapefiles, and ESRI ASCII.


Kartograph is a simple and lightweight framework for building interactive map applications without Google Maps or any other mapping service. It was created with the needs of designers and data journalists in mind. Actually, Kartograph is two libraries. One generates beautiful & compact SVG maps; the other helps you to create interactive maps that run across all major browsers.

The Kartograph.py library is a Python library for generating beautiful, Illustrator-friendly SVG maps. The Kartograph.js library is a JavaScript library for creating interactive maps based on SVG maps.


Visualize logs and time-stamped data. Elasticsearch works seamlessly with Kibana to let you see and interact with your data.


Keyhole Markup Language (KML) is an XML notation for expressing geographic annotation and visualization within Internet-based, two-dimensional maps and three-dimensional Earth browsers. The KML file specifies a set of features (place marks, images, polygons, 3D models, textual descriptions, etc.) for display in Here Maps, Google Earth, Maps and Mobile, or any other geospatial software implementing the KML encoding. Each place always has a longitude and a latitude. Other data can make the view more specific, such as tilt, heading, altitude, which together define a "camera view" along with a timestamp or timespan. KML shares some of the same structural grammar as GML. Some KML information cannot be viewed in Google Maps or Mobile.

KML files are very often distributed in KMZ files, which are zipped KML files with a .kmz extension. These must be legacy (ZIP 2.0) compression compatible (i.e. stored or deflate method), otherwise the .kmz file might not uncompress in all geobrowsers. The contents of a KMZ file are a single root KML document and optionally any overlays, images, icons, and COLLADA 3D models referenced in the KML, including network-linked KML files. By convention, the root KML document is a file named "doc.kml" at the root directory level, which is the file loaded upon opening, and referenced files are in subdirectories (e.g. images for overlay images).
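The KMZ layout described here can be reproduced with the Python standard library. The sketch below builds a minimal KML document with a single placemark and packs it as "doc.kml" at the root of a deflate-compressed archive. The helper names, the placemark name and the coordinates are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def make_kml(name, lon, lat):
    """Return a minimal KML document with one Placemark, as bytes."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML coordinates are written longitude first, then latitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="utf-8", xml_declaration=True)


def make_kmz(kml_bytes):
    """Pack KML bytes into an in-memory KMZ using the deflate method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as kmz:
        kmz.writestr("doc.kml", kml_bytes)  # root document at top level
    return buf.getvalue()


if __name__ == "__main__":
    data = make_kmz(make_kml("Example place", 18.07, 59.33))
    with zipfile.ZipFile(io.BytesIO(data)) as kmz:
        print(kmz.namelist())
```

Referenced overlay images or models would be added with further `writestr` calls under subdirectory names, in line with the convention above.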


Fastkml is a library to read, write and manipulate KML files. It aims to keep it simple and fast (using lxml if available). Fast refers to the time you spend to write and read KML files as well as the time you spend to get acquainted with the library or to create KML objects. It aims to provide all of the functionality that KML clients such as OpenLayers, Google Maps, and Google Earth provide.


The KPP kinetic preprocessor is a software tool that assists the computer simulation of chemical kinetic systems. The concentrations of a chemical system evolve in time according to the differential law of mass action kinetics. A numerical simulation requires an implementation of the differential laws and a numerical integration in time.

KPP translates a specification of the chemical mechanism into Fortran77, Fortran90, C, or Matlab simulation code that implements the concentration time derivative function, its Jacobian, and its Hessian, together with a suitable numerical integration scheme. Sparsity in Jacobian/Hessian is carefully exploited in order to obtain computational efficiency.
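For a sense of what such generated code has to compute, here is a hand-written Python sketch for the single reaction A + B -> C under mass action kinetics, where the reaction rate is k[A][B]. It is not KPP output. The function names are hypothetical, and the explicit Runge-Kutta integrator is a simple stand-in for the stiff integrators a real mechanism would need.

```python
def rates(conc, k):
    """Time derivatives for A + B -> C under mass action: rate = k[A][B]."""
    a, b, _c = conc
    r = k * a * b
    return [-r, -r, r]


def jacobian(conc, k):
    """Jacobian d(rates)/d(conc); the last column is zero since C is inert."""
    a, b, _c = conc
    return [[-k * b, -k * a, 0.0],
            [-k * b, -k * a, 0.0],
            [k * b, k * a, 0.0]]


def integrate(conc, k, dt, steps):
    """Explicit fourth-order Runge-Kutta; adequate only for non-stiff cases."""
    for _ in range(steps):
        k1 = rates(conc, k)
        k2 = rates([c + 0.5 * dt * d for c, d in zip(conc, k1)], k)
        k3 = rates([c + 0.5 * dt * d for c, d in zip(conc, k2)], k)
        k4 = rates([c + dt * d for c, d in zip(conc, k3)], k)
        conc = [c + dt / 6.0 * (d1 + 2 * d2 + 2 * d3 + d4)
                for c, d1, d2, d3, d4 in zip(conc, k1, k2, k3, k4)]
    return conc


if __name__ == "__main__":
    print(integrate([1.0, 0.5, 0.0], k=2.0, dt=0.01, steps=500))
```

The Jacobian is not used by the explicit scheme here. It is shown because implicit integrators for stiff systems need it at every step, which is why generating it automatically and exploiting its sparsity matters.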

KPP incorporates a library with several widely used atmospheric chemistry mechanisms; the users can add their own chemical mechanisms to the library. KPP also includes a comprehensive suite of stiff numerical integrators. The KPP development environment is designed in a modular fashion and allows for rapid prototyping of new chemical kinetic schemes as well as new numerical integration methods.


Krita is a KDE program for sketching and painting, offering an end-to-end solution for creating digital painting files from scratch by masters. Fields of painting that Krita explicitly supports are concept art, creation of comics and textures for rendering. Modeled on existing real-world painting materials and workflows, Krita supports creative working by getting out of the way and with a snappy response. There are three versions of Krita: Krita Sketch, for touch devices; Krita Desktop, for desktop systems; and Krita Studio, which is like Krita Desktop but supported by KO GmbH.


A free and open source video compositing software, similar in functionality to Adobe After Effects or Nuke by The Foundry. The project is a free node-based compositor that relies on OpenColorIO for color management, OpenImageIO for file format support, and Qt for the user interface. It also works with 32-bit float per channel precision and supports OFX plugins, both free and commercial.


The KSTAR project supports the development of Klang, a source-to-source compiler that turns C programs with OpenMP pragmas into C programs with calls to either the StarPU or the Kaapi runtime system. The features include OpenMP 3.1, the OpenMP 4.0 depend clause, Accelerators extensions, and C/C++ source-to-source translation based on Clang.


A unified runtime system for heterogeneous multicore architectures.

Software that uses StarPU to run on heterogeneous architectures includes MAGMA, SkePU and PaStiX.


A runtime for scheduling irregular fine grain tasks with data flow dependencies. It can be used through OpenMP 4.0 compliant applications using the GNU C or C++ compiler, Intel compilers, or the research C/C++ source-to-source compiler KSTAR. It is a C library that allows the execution of multithreaded computations with data flow synchronization between threads. The library is able to schedule fine- and medium-grain programs on distributed machines. The data flow graph is dynamic (unfolded at runtime). Target architectures are clusters of SMP machines.


LabPlot is an application for interactive graphing and analysis of scientific data.


The Linear Algebra Package is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It also includes routines to implement the associated matrix factorizations such as LU, QR, Cholesky and Schur decomposition. LAPACK was originally written in FORTRAN 77, but moved to Fortran 90 in version 3.2 (2008).[1] The routines handle both real and complex matrices in both single and double precision.
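As a reminder of what solving a linear system by factorization involves, here is a small pure-Python sketch of Gaussian elimination with partial pivoting, the computation that underlies an LU-based solve. It is a teaching sketch only, with a hypothetical function name. LAPACK's own routines are blocked, heavily tuned and far more careful numerically.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    a is a square, non-singular matrix given as a list of rows;
    b is the right-hand side as a list of numbers.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on copies so the inputs are untouched
    b = b[:]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(a[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (b[row] - s) / a[row][row]
    return x


if __name__ == "__main__":
    print(solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```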

LAPACK was designed to effectively exploit the caches on modern cache-based architectures and can run very fast given a well-tuned BLAS implementation. The routines are written so that as much as possible of the computation is performed by calls to the Basic Linear Algebra Subprograms (BLAS). LAPACK was designed at the outset to exploit the Level 3 BLAS, a set of specifications for Fortran subprograms that do various types of matrix multiplication and the solution of triangular systems with multiple right-hand sides. Because of the coarse granularity of the Level 3 BLAS operations, their use promotes high efficiency on many high-performance computers, particularly if specially coded implementations are provided by the manufacturer. LAPACK has also been extended to run on distributed-memory systems in later packages such as ScaLAPACK and PLAPACK.


A modern open-source JavaScript library for mobile-friendly interactive maps. Leaflet is designed with simplicity, performance and usability in mind. It works efficiently across all major desktop and mobile platforms out of the box, taking advantage of HTML5 and CSS3 on modern browsers while still being accessible on older ones. It can be extended with a large number of plugins.

Basics of Making Maps with Leaflet and Browserify -


Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on a Leaflet map via Folium.

Folium makes it easy to visualize data that’s been manipulated in Python on an interactive Leaflet map. It enables both the binding of data to a map for choropleth visualizations as well as passing Vincent/Vega visualizations as markers on the map.

The library has a number of built-in tilesets from OpenStreetMap, Mapbox, and Stamen, and supports custom tilesets with Mapbox or Cloudmade API keys. Folium supports both GeoJSON and TopoJSON overlays, as well as the binding of data to those overlays to create choropleth maps with color-brewer color schemes.


A free and open-source library of codecs for encoding and decoding video and audio data.[5] Because of a project fork, libraries with this name are provided by FFmpeg and libav, but they are incompatible.

This is an integral part of many open-source multimedia applications and frameworks. The popular MPlayer, xine and VLC media players use it as their main, built-in decoding engine that enables playback of many audio and video formats on all supported platforms. It is also used by the ffdshow tryouts decoder as its primary decoding library. libavcodec is also used in video editing and transcoding applications like Avidemux, MEncoder or Kdenlive for both decoding and encoding.

Libavcodec contains decoder and sometimes encoder implementations of several proprietary formats, including ones for which no public specification has been released. As such, a significant reverse engineering effort is part of libavcodec development. Having such codecs available within the standard libavcodec framework gives a number of benefits over using the original codecs, most notably increased portability, and in some cases also better performance, since libavcodec contains a standard library of highly optimized implementations of common building blocks, such as DCT and color space conversion.


Provides cross-platform tools and libraries to convert, manipulate and stream a wide range of multimedia formats and protocols.


A C++ library of algorithms for representing cloud microphysics in numerical models. Currently (6/15) the library covers three warm-rain schemes: the single- and double-moment bulk schemes, and the particle-based scheme with Monte-Carlo coalescence. This requires Boost and Thrust.


libeemd is a C library for performing the ensemble empirical mode decomposition (EEMD), its complete variant (CEEMDAN) or the regular empirical mode decomposition (EMD). It includes a Python interface called pyeemd. The details of what libeemd actually computes are available as a separate article, which you should read if you are unsure about what EMD, EEMD and CEEMDAN are.


The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications. Libfabric is a software library instantiation of OFI.

The goal of OFI and libfabric is to define interfaces that enable a tight semantic map between applications and underlying fabric services. Specifically, libfabric software interfaces have been co-designed with fabric hardware providers and application developers, with a focus on the needs of HPC users. OFI supports multiple interface semantics, is fabric and hardware implementation agnostic, and leverages and expands the existing RDMA open source community.

Libfabric is designed to minimize the impedance mismatch between applications, including middleware such as MPI, SHMEM, and PGAS, and fabric communication hardware. Its interfaces target high-bandwidth, low-latency NICs, with a goal to scale to tens of thousands of nodes.


This acts as a highly efficient multi-dimensional array of arbitrary objects, but really uses a struct of arrays memory layout. It’s great for writing vectorized code and its lightning-fast iterators give you access to neighboring elements with zero address generation overhead.
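The difference between the object-style interface and the struct-of-arrays storage can be shown in a few lines of Python. This is only a conceptual sketch, not LibFlatArray (which is a C++ library): the class and field names are hypothetical, and Python itself gains no vectorization from this layout.

```python
from array import array


class ParticleSoA:
    """Looks like a list of particle objects, stores one array per field."""

    def __init__(self, n):
        self.x = array("d", [0.0] * n)     # all x values are contiguous
        self.mass = array("d", [1.0] * n)  # all masses are contiguous

    def __len__(self):
        return len(self.x)

    def __getitem__(self, i):
        # Object-style view of element i, assembled from the field arrays.
        return {"x": self.x[i], "mass": self.mass[i]}

    def shift_all(self, dx):
        # A field-wise loop touches only the x array, which is the access
        # pattern that vectorizes well in a compiled language.
        for i in range(len(self.x)):
            self.x[i] += dx


if __name__ == "__main__":
    particles = ParticleSoA(3)
    particles.shift_all(0.5)
    print(particles[1])
```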


LibGeoDecomp (Library for Geometric Decomposition codes) is an auto-parallelizing library for computer simulations. It is written in C++ and works best with kernels written in C++, but other languages (e.g. Fortran) may be linked in, too. Thanks to its modular design the library can harness all state of the art hardware architectures, e.g. multi-core CPUs, GPUs (currently only NVIDIA GPUs, via CUDA), Intel Xeon Phi, MPI clusters and Raspberry Pi.

The library takes over the spatial and temporal loops of the simulation as well as storage of the simulation data. It will call back the user code for performing the actual computations. User code in turn calls back the library to access simulation data. Thanks to this two-way callback the library can control which part of the code runs when.

Users can build custom computer simulations (e.g. engineering or natural sciences problems) by encapsulating their model in a C++ class. This class is then supplied to the library as a template parameter. The library essentially relieves the user from the pains of parallel programming, but is limited to applications which perform space- and time-discrete simulations with only local interactions.
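
The two-way callback described above can be sketched in a few lines. This is a toy illustration in Python rather than LibGeoDecomp's C++ template API; the names `HeatCell`, `Neighborhood` and `run_simulation` are hypothetical, and the driver here is serial.

```python
# Toy sketch of a two-way callback between a simulation driver and a user model.
# Not LibGeoDecomp's API: all names here are hypothetical.

class Neighborhood:
    """Handed to user code so it can call back into the driver for data."""

    def __init__(self, grid, index):
        self._grid = grid
        self._index = index

    def get(self, offset):
        # Periodic boundary: wrap around the ends of the 1D grid.
        return self._grid[(self._index + offset) % len(self._grid)]


class HeatCell:
    """User model: one cell of a 1D diffusion problem with local interactions."""

    def __init__(self, temperature=0.0):
        self.temperature = temperature

    def update(self, hood):
        # The driver calls update(); update() calls hood.get() for neighbors.
        left = hood.get(-1).temperature
        mid = hood.get(0).temperature
        right = hood.get(1).temperature
        return HeatCell(mid + 0.25 * (left - 2.0 * mid + right))


def run_simulation(cells, steps):
    """Driver: owns the time loop, the space loop, and the storage."""
    for _ in range(steps):
        cells = [cell.update(Neighborhood(cells, i)) for i, cell in enumerate(cells)]
    return cells


grid = [HeatCell(100.0 if i == 4 else 0.0) for i in range(9)]
result = run_simulation(grid, steps=3)
temperatures = [c.temperature for c in result]
```

Because the driver owns both loops and the storage, it is free to decide which cells are updated where and when, which is the property a parallelizing library relies on.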


A set of tools for accessing and modifying virtual machine (VM) disk images. You can use this for viewing and editing files inside guests, scripting changes to VMs, monitoring disk used/free statistics, creating guests, P2V, V2V, performing backups, cloning VMs, building VMs, formatting disks, resizing disks, and much more.

libguestfs can access almost any disk image imaginable. It can do it securely — without needing root and with multiple layers of defence against rogue disk images. It can access disk images on remote machines or on CDs/USB sticks. It can access proprietary systems like VMware and Hyper-V.

All this functionality is available through a scriptable shell called guestfish, or an interactive rescue shell virt-rescue.

libguestfs is a C library that can be linked with C and C++ management programs and has bindings for about a dozen other programming languages. Using our FUSE module you can also mount guest filesystems on the host.


This package contains miscellaneous system administrator command line tools for virtual machines.


A library of parallel forward-in-time solvers for systems of generalised transport equations. The solvers belong to the Multidimensional Positive Definite Advection Transport Algorithm (MPDATA) family of numerical schemes. This requires Blitz, Boost and HDF5.


The Libra Toolkit is a collection of algorithms for learning and inference with discrete probabilistic models, including Bayesian networks (BNs), Markov networks (MNs), dependency networks (DNs), sum-product networks (SPNs), and arithmetic circuits (ACs). Compared to other toolkits, Libra focuses more on structure learning, especially for tractable models in which exact inference is efficient. Each algorithm in Libra is implemented as a command-line program suitable for interactive use or scripting, with consistent options and file formats throughout the toolkit.

All methods are implemented in OCaml, in order to obtain the best possible speed while keeping the code compact and easy to work with. To compile the source code, you must have a UNIX-like environment (Linux, OS X) with a recent version of OCaml installed.


An extensible, high-performance, cross-platform, open-source software library for the simulation and analysis of models expressed using Systems Biology Markup Language (SBML). SBML is the most widely used standard for representing dynamic networks, especially biochemical networks. libRoadRunner supports solution of both large models and multiple replicas of a single model on desktop, mobile and cluster computers. libRoadRunner is a self-contained library, able to run both as a component inside other tools via its C++ and C bindings and interactively through its Python interface. libRoadRunner uses a custom Just-In-Time (JIT) compiler built on the widely-used LLVM JIT compiler framework to compile SBML-specified models directly into very fast native machine code for a variety of processors, making it appropriate for solving very large models or multiple replicas of smaller models. libRoadRunner is flexible, supporting the bulk of the SBML specification (except for delay and nonlinear algebraic equations) and several of its extensions. It offers multiple deterministic and stochastic integrators, as well as tools for steady-state analysis, stability analysis and flux balance analysis.


A multi-platform support library with a focus on asynchronous I/O. It was primarily developed for use by Node.js, but it’s also used by Luvit, Julia, pyuv, and others.

An Introduction to libuv -


A Python module which provides an interface to libuv.


Lighthouse is a framework for creating, maintaining, and using a taxonomy of available software that can be used to build highly-optimized matrix algebra computations. The taxonomy provides an organized anthology of software components and programming tools needed for that task. The taxonomy will serve as a guide to practitioners seeking to learn what is available for their programming tasks, how to use it, and how the various parts fit together. It builds upon and improves existing collections of numerical software, adding tools for the tuning of matrix algebra computations.


Limulus is an acronym for LInux MULti-core Unified Supercomputer. The Limulus project goal is to create and maintain an open specification and software stack for a personal workstation cluster. Ideally, a user should be able to build or purchase a small personal workstation cluster using the Limulus reference design and low cost hardware. In addition, a freely available turn-key Linux based software stack will be created and maintained for use on the Limulus design. A Limulus is intended to be a workstation cluster platform where users can develop software, test ideas, run small scale applications, and teach HPC methods.


LinBox is a C++ template library for exact, high-performance linear algebra computation with dense, sparse, and structured matrices over the integers and over finite fields. LinBox aims to provide world-class high performance implementations of the most advanced algorithms for exact linear algebra.

LinBox was originally designed primarily to work with sparse and structured matrices, which are defined in this context as those matrices where the computational cost of application of an m by n matrix to a vector is significantly less than O(mn), the cost for a dense matrix. Now, increasingly, LinBox also has codes for dense matrix computations using floating point BLAS routines for speed while not sacrificing exactness.
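
The cost argument above can be made concrete with a small sketch of exact sparse matrix application over a finite field. It is illustrative only: LinBox is a C++ template library with its own interfaces, and the function name `sparse_apply`, the triple format and the modulus are assumptions made for the example.

```python
# Sketch: applying a sparse matrix to a vector exactly, over the finite field GF(p).
# Illustrative only; the names and data layout are hypothetical.

P = 101  # a small prime modulus, chosen arbitrarily for the example


def sparse_apply(entries, num_rows, vector, p=P):
    """Compute A*v mod p, where A is given as (row, col, value) triples.

    The work is proportional to the number of stored entries, not to
    num_rows * len(vector) as it would be for a dense matrix.
    """
    result = [0] * num_rows
    for row, col, value in entries:
        result[row] = (result[row] + value * vector[col]) % p
    return result


# A 3x4 matrix with only four nonzero entries.
A = [(0, 0, 2), (0, 3, 100), (1, 1, 7), (2, 2, 50)]
v = [1, 2, 3, 4]
y = sparse_apply(A, 3, v)
```

All arithmetic is done on integers modulo a prime, so the result is exact; there is no rounding error to accumulate.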

LinBox implements iterative system-solving methods such as those of Wiedemann and Lanczos to operate on very large, sparse linear systems. This avoids the potential fill-in associated with elimination-based methods and keeps memory use relatively constant through the computation. These methods allow LinBox to be used to solve systems with hundreds of thousands of equations and hundreds of thousands of variables.


LinuxCNC (the Enhanced Machine Control) is a software system for computer control of machine tools such as milling machines and lathes. It provides:

  • several graphical user interfaces including one for touch screens

  • an interpreter for "G-code" (the RS-274 machine tool programming language)

  • a realtime motion planning system with look-ahead

  • operation of low-level machine electronics such as sensors and motor drives

  • an easy to use "breadboard" layer for quickly creating a unique configuration for your machine

  • a software PLC programmable with ladder diagrams

  • easy installation with .deb packages or a Live-CD

It does not provide drawing (CAD - Computer Aided Design) or G-code generation from the drawing (CAM - Computer Automated Manufacturing) functions.

It can simultaneously move up to 9 axes and supports a variety of interfaces. The control can operate true servos (analog or PWM) with the feedback loop closed by the LinuxCNC software at the computer, or open loop with "step-servos" or stepper motors. Motion control features include: cutter radius and length compensation, path deviation limited to a specified tolerance, lathe threading, synchronized axis motion, adaptive feedrate, operator feed override, and constant velocity control. Support for non-Cartesian motion systems is provided via custom kinematics modules. Available architectures include hexapods (Stewart platforms and similar concepts) and systems with rotary joints to provide motion such as PUMA or SCARA robots. LinuxCNC runs on Linux using real time extensions. Support currently exists for version 2.4 and 2.6 Linux kernels with real time extensions applied by RT-Linux or RTAI patches.

Linux Diminutives

Linux distributions that aren’t (necessarily) lesser but rather smaller in size.


CoreOS is a new Linux distribution that has been rearchitected to provide features needed to run modern infrastructure stacks. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience.


Welcome to OpenEmbedded, the build framework for embedded Linux. OpenEmbedded offers a best-in-class cross-compile environment. It allows developers to create a complete Linux Distribution for embedded systems. The OpenEmbedded-Core Project (OE-Core for short) resulted from the merge of the Yocto Project with OpenEmbedded.


The Yocto Project is an open source collaboration project that provides templates, tools and methods to help you create custom Linux-based systems for embedded products regardless of the hardware architecture. It was founded in 2010 as a collaboration among many hardware manufacturers, open-source operating systems vendors, and electronics companies to bring some order to the chaos of embedded Linux development.

The Yocto Project provides resources and information catering to both new and experienced users, and includes core system component recipes provided by the OpenEmbedded project. The Yocto Project also provides pointers to example code built to demonstrate its capabilities. These community-tested images include the Yocto Project kernel and cover several build profiles across multiple architectures including ARM, PPC, MIPS, x86, and x86-64. Specific platform support takes the form of Board Support Package (BSP) layers for which a standard format has been developed. The project also provides an Eclipse IDE plug-in and a graphical user interface to the build system called Hob.


Replicant is a fully free Android distribution running on several devices, a free software mobile operating system putting the emphasis on freedom and privacy/security. It aims to replace all proprietary Android components with their free software counterparts. This also makes it a security-focused operating system as it closes discovered Android backdoors. It is available for several smartphones and tablet computers.


An operating system based on the Linux kernel and the GNU C Library implementing the Linux API. It targets a very wide range of devices including smartphones, tablets, in-vehicle infotainment (IVI) devices, smart TVs, PCs, smart cameras, wearable computing (such as smartwatches), Blu-ray players, printers and smart home appliances (such as refrigerators, lighting, washing machines, air conditioners, ovens/microwaves and a robotic vacuum cleaner). Its purpose is to offer a consistent user experience across devices. Tizen is a project within the Linux Foundation and is governed by a Technical Steering Group (TSG) composed of Samsung and Intel among others.

HTML5 applications run on Tizen, Android, Firefox OS, Ubuntu Touch, Windows Phone, and webOS without a browser. Applications based on Qt, GTK+ and EFL frameworks can run on Tizen IVI. While there is no official support for these third-party frameworks, according to the explanation on the Tizen SDK Web site, Tizen applications for mobile devices can be developed without relying on an official Tizen IDE as long as the application complies with Tizen packaging rules. In May 2013, a community port of Qt to Tizen focused on delivering native GUI controls and integration of Qt with Tizen OS features for smartphones. Based on the Qt port to Tizen, Tizen and Mer can interchange code.


WebOS, also known as webOS, LG webOS, Open webOS, or HP webOS, is a Linux kernel-based multitasking operating system for smart devices like TVs and smartwatches, and was formerly a mobile operating system. It was initially developed by Palm, which was acquired by Hewlett-Packard; HP then made the platform open source, and it became Open webOS.

The WebOS mobile platform introduced features so innovative that some are still in use by Apple, Microsoft and Google on their mobile operating systems iOS, Windows Phone, and Android, respectively.

Linux Distributions

More for unusual and/or cute ones than those that are more well known.

Linux From Scratch

Linux From Scratch (LFS) is a project that provides you with step-by-step instructions for building your own custom Linux system, entirely from source code.


The LOCKSS Program, based at Stanford University Libraries, provides libraries and publishers with award-winning, low-cost, open source digital preservation tools to preserve and provide access to persistent and authoritative digital content.

We recommend installing the LOCKSS software on a dedicated server or a virtual machine. We provide the LOCKSS software integrated with a Linux installation based on CentOS 6.


A Linux distribution with a unique approach to package and configuration management. Built on top of the Nix package manager, it is completely declarative, makes upgrading systems reliable, and has many other advantages.


Openwall GNU/*/Linux (or Owl for short) is a small security-enhanced Linux distribution for servers, appliances, and virtual appliances. Owl live CDs with remote SSH access are also good for recovering or installing systems (whether with Owl or not). Another secondary use is for operating systems and/or computer security courses, which benefit from the simple structure of Owl and from our inclusion of the complete build environment.

Linux Initialization

See below.


A highly-available key value store for shared configuration and service discovery. A component of CoreOS.


This ties together systemd and etcd into a distributed init system. Think of it as an extension of systemd that operates at the cluster level instead of the machine level. This project is very low level and is designed as a foundation for higher order orchestration.


In Unix-based computer operating systems, init (short for initialization) is the first process started during booting of the computer system. Init is a daemon process that continues running until the system is shut down. It is the direct or indirect ancestor of all other processes and automatically adopts all orphaned processes. Init is started by the kernel using a hard-coded filename; a kernel panic will occur if the kernel is unable to start it. Init is typically assigned process identifier 1.

The design of init has diverged in Unix systems such as System III and System V from the functionality provided by the init in Research Unix and its BSD derivatives. The usage on most Linux distributions is somewhat compatible with System V, but some distributions, such as Slackware, use a BSD-style init, and others, such as Gentoo, have their own customized version.

Several replacement init implementations have been written in an attempt to address design limitations in the standard versions. These include launchd, the Service Management Facility, systemd and Upstart.


A suite of basic building blocks for a Linux system. It provides a system and service manager that runs as PID 1 and starts the rest of the system. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic. systemd supports SysV and LSB init scripts and works as a replacement for sysvinit. Other parts include a logging daemon, utilities to control basic system configuration like the hostname, date, locale, maintain a list of logged-in users and running containers and virtual machines, system accounts, runtime directories and settings, and daemons to manage simple network configuration, network time synchronization, log forwarding, and name resolution.


Upstart is an event-based replacement for the /sbin/init daemon which handles starting of tasks and services during boot, stopping them during shutdown and supervising them while the system is running. It was originally developed for the Ubuntu distribution, but is intended to be suitable for deployment in all Linux distributions as a replacement for the venerable System-V init.

Linux Package Managers


The software at the base of the package management system in the free operating system Debian and its numerous derivatives. dpkg is used to install, remove, and provide information about .deb packages.

dpkg itself is a low level tool; higher level tools, such as APT, are used to fetch packages from remote locations or deal with complex package relations. Tools like aptitude or synaptic are more commonly used than dpkg on its own, as they have a more sophisticated way of dealing with package relationships and a friendlier interface.


Aptitude is an ncurses-based front end to APT, the Debian package manager. Since it is text based, it is run from a terminal or a CLI (command line interface).

dpkg on Fedora

An RPM file containing dpkg for Fedora distributions.


A purely functional package manager for the GNU system.

Reproducible and User-Controlled Software Environments in HPC with Guix -


Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. It provides atomic upgrades and rollbacks, side-by-side installation of multiple versions of a package, multi-user package management and easy setup of build environments.


The Nix Packages collection (Nixpkgs) is a set of nearly 6,500 packages for the Nix package manager, released under a permissive MIT/X11 license. On GNU/Linux, the packages in Nixpkgs are ‘pure’, meaning that they have no dependencies on packages outside of the Nix store. This means that they should work on pretty much any GNU/Linux distribution.


OpenPKG provides a flexible and extensive toolkit of about 1500 portable and high-quality Unix server software packages within a fully self-contained packaging framework. OpenPKG 4 supports all major Unix server platforms, including BSD, GNU/Linux, Solaris and MacOS X flavors, and can be deployed multiple times on a single system without virtualization technologies and with minimum intrusion. The OpenPKG software distribution is updated daily and hence always provides you with the latest Open Source server software.


The RPM Package Manager (RPM) is a powerful command line driven package management system capable of installing, uninstalling, verifying, querying, and updating computer software packages. Each software package consists of an archive of files along with information about the package like its version, a description, and the like. There is also a library API, permitting advanced developers to manage such transactions from programming languages such as C or Python.



Antik provides a foundation for scientific and engineering computation in Common Lisp. It is designed not only to facilitate numerical computations, but to permit the use of numerical computation libraries and the interchange of data and procedures, whether foreign (non-lisp) or Lisp libraries. It is named after the Antikythera mechanism, one of the oldest examples of a scientific computer known.


CL21 is an experimental project redesigning Common Lisp.


FEMLISP is a Common Lisp framework for solving partial differential equations with the help of the finite element method (FEM).


In a nutshell, Hy is a Lisp dialect, but one that converts its structure into Python …​ literally a conversion into Python’s abstract syntax tree! (Or to put it in more crude terms, Hy is lisp-stick on a Python!)

This is pretty cool because it means Hy is several things:

  • A Lisp that feels very Pythonic

  • For Lispers, a great way to use Lisp’s crazy powers but in the wide world of Python’s libraries (why yes, you now can write a Django application in Lisp!)

  • For Pythonistas, a great way to start exploring Lisp, from the comfort of Python!

  • For everyone: a pleasant language that has a lot of neat ideas!
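
The idea of turning Lisp forms into a Python abstract syntax tree can be sketched with the standard `ast` module. This is not Hy's implementation: the tiny reader below is hypothetical and handles only integers and the operators `+`, `-` and `*`.

```python
# Sketch: turning a Lisp-style expression into a Python abstract syntax tree.
# Not Hy's implementation; the reader and translator here are hypothetical.
import ast

OPERATORS = {"+": ast.Add, "-": ast.Sub, "*": ast.Mult}


def read(source):
    """Parse '(+ 1 (* 2 3))' into nested Python lists."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1
        token = tokens[pos]
        return (int(token) if token.lstrip("-").isdigit() else token), pos + 1

    return parse(0)[0]


def to_ast(form):
    """Convert a nested list into an ast expression node."""
    if isinstance(form, int):
        return ast.Constant(value=form)
    op, first, *rest = form
    node = to_ast(first)
    for operand in rest:
        node = ast.BinOp(left=node, op=OPERATORS[op](), right=to_ast(operand))
    return node


def lisp_eval(source):
    """Build a Python AST from Lisp source, compile it, and evaluate it."""
    tree = ast.Expression(body=to_ast(read(source)))
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<lisp>", "eval"))


value = lisp_eval("(+ 1 (* 2 3) 4)")
```

Once the forms are ordinary `ast` nodes, the regular Python compiler takes over, which is why code written this way can interoperate with existing Python libraries.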


Mal is a Clojure-inspired Lisp interpreter.

Mal is implemented in 26 different languages.

Mal is a learning tool. Each implementation of mal is separated into 11 incremental, self-contained (and testable) steps that demonstrate core concepts of Lisp. The last step is capable of self-hosting (running the mal implementation of mal).


Parenscript is a translator from an extended subset of Common Lisp to JavaScript. Parenscript code can run almost identically on both the browser (as JavaScript) and server (as Common Lisp).

Parenscript code is treated the same way as Common Lisp code, making the full power of Lisp macros available for JavaScript. This provides a web development environment that is unmatched in its ability to reduce code duplication and provide advanced metaprogramming facilities to web developers.


A Parenscript to Javascript command line compiler and REPL.


Quicklisp makes it easy to get started with a rich set of community-developed Common Lisp libraries. Quicklisp is a library manager for Common Lisp. It works with your existing Common Lisp implementation to download, install, and load any of over 1,100 libraries with a few simple commands. Quicklisp is easy to install and works with ABCL, Allegro CL, Clozure CL, CLISP, CMUCL, ECL, LispWorks, SBCL, and Scieneer CL, on Linux.


Livingstone2 is a reusable artificial intelligence (AI) software system designed to assist spacecraft, life support systems, chemical plants or other complex systems in operating robustly with minimal human supervision, even in the face of hardware failures or unexpected events. Livingstone2 diagnoses the current state of the spacecraft or other system and recommends commands or repair actions that will allow the system to continue operations.

Livingstone2 is an enhancement and re-engineering of the Livingstone diagnosis system that was flight tested on-board the Deep Space One spacecraft in May 1999. It contains significant enhancements to robustness, performance and usability. Livingstone2 is able to track multiple diagnostic hypotheses, as opposed to a single hypothesis in Livingstone. It is also able to revise diagnostic decisions made in the past when additional observations become available. In such cases, Livingstone might find the incorrect hypothesis. These improvements increase robustness.

Re-architecting and re-implementing the system in C++ has increased performance. Usability has been vastly improved by creating a set of development tools which are closely integrated with the Livingstone2 engine. In addition to the core diagnosis engine, Livingstone2 now includes a compiler that translates diagnostic models written in a Java-like language into Livingstone2’s language, and a broad set of graphical tools for model development. These software tools support the rapid deployment of model-based representations of complex systems for Livingstone2 via a visual model builder/tester (Stanley), and two graphical user interface tools (Candidate Manager and History Table) which provide Livingstone2 status information during testing. Runtime support is provided by the real-time interface (RTI) which converts analog sensor readings to the digital values required by Livingstone2.

Also included in the Livingstone2 download is Oliver, a prototype model builder/tester. It is incomplete, but could be used as a starting place for a new model builder/tester.


The LLVM compiler infrastructure project (formerly Low Level Virtual Machine) is a compiler infrastructure designed as a set of reusable libraries with well-defined interfaces. It is written in C++ and is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Originally implemented for C and C++, the language-agnostic design (and the success) of LLVM has since spawned a wide variety of front ends: languages with compilers that use LLVM include Common Lisp, ActionScript, Ada, D, Fortran, OpenGL Shading Language, Go, Haskell, Java bytecode, Julia, Objective-C, Swift, Python, Ruby, Rust, Scala, C# and Lua.

LLVM Weekly Newsletter -

LLVM for Grad Students -


This project aims to fully build the Linux kernel using Clang which is the C front end for the LLVM compiler infrastructure project. Together Clang and LLVM have many positive attributes and features which many developers and system integrators would like to take advantage of when developing and deploying the Linux Kernel as a part of their own projects.


Pure is a modern-style functional programming language based on term rewriting. It offers equational definitions with pattern matching, full symbolic rewriting capabilities, dynamic typing, eager and lazy evaluation, lexical closures, built-in list and matrix support and an easy-to-use C interface. The interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native code.


Implementation of the LLVM tutorial in Python.


Souper is a superoptimizer for LLVM IR. It uses an SMT solver to help identify missing peephole optimizations in LLVM’s midend optimizers.

A large amount of numerically-oriented code is written and is being written in legacy languages. Much of this code could, in principle, make good use of data-parallel throughput-oriented computer architectures. This transformation-based programming system, targeted at GPUs and general data-parallel architectures, provides a mechanism for user-controlled transformation of array programs. This transformation capability is designed to apply not just to programs written specifically for it, but also to those imported from other languages such as Fortran. It eases the trade-off between achieving high performance, portability, and programmability by allowing the user to apply a large and growing family of transformations to an input program. These transformations are expressed in and used from Python and may be applied from a variety of settings, including a pragma-like manner from other languages.


An acronym made from the names of the five different models that have been coupled to build the Earth system model: LOch–Vecode-Ecbilt-CLio-agIsm Model (LOVECLIM).


LROSE is an NSF-backed project to develop common software for the LIDAR, RADAR and PROFILER community.


The Larval TRANSport Lagrangian model (LTRANS v.2b) is an off-line particle-tracking model that runs with the stored predictions of a 3D hydrodynamic model, specifically the Regional Ocean Modeling System (ROMS). Although LTRANS was built to simulate oyster larvae, it can easily be adapted to simulate passive particles and other planktonic organisms. LTRANS v.2 is written in Fortran 90 and is designed to track the trajectories of particles in three dimensions. It includes a 4th order Runge-Kutta scheme for particle advection and a random displacement model for vertical turbulent particle motion. Reflective boundary conditions, larval behavior, and settlement routines are also included.
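
A 4th-order Runge-Kutta advection step of the kind mentioned above can be sketched as follows. This is not LTRANS code (which is Fortran 90 and reads velocities from ROMS output); the analytic, steady, two-dimensional velocity field is a stand-in chosen so that the result can be checked, and the random displacement model is left out.

```python
# Sketch of 4th-order Runge-Kutta particle advection in a steady velocity field.
# The velocity function is a hypothetical stand-in for hydrodynamic model output.
import math


def velocity(x, y):
    """Solid-body rotation about the origin at 1 rad per time unit."""
    return -y, x


def rk4_step(x, y, dt):
    """Advance one particle by dt using the classical RK4 scheme."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = velocity(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


# Track a particle for one full revolution; it should return near (1, 0).
steps = 200
dt = 2.0 * math.pi / steps
x, y = 1.0, 0.0
for _ in range(steps):
    x, y = rk4_step(x, y, dt)
```

In an off-line model the `velocity` function would instead interpolate stored velocity fields in space and time at the particle position.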


A powerful, fast, lightweight, embeddable scripting language. Lua combines simple procedural syntax with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, runs by interpreting bytecode for a register-based virtual machine, and has automatic memory management with incremental garbage collection, making it ideal for configuration, scripting, and rapid prototyping.


A Just-In-Time Compiler (JIT) for the Lua programming language. Lua is a powerful, dynamic and light-weight programming language. It may be embedded or used as a general-purpose, stand-alone language. LuaJIT has been successfully used as a scripting middleware in games, appliances, network and graphics apps, numerical simulations, trading platforms and many other specialty applications. It scales from embedded devices, smartphones, desktops up to server farms. It combines high flexibility with high performance and an unmatched low memory footprint.


LuxMark is an OpenCL benchmark tool. The idea for the program was conceived in 2009 by Jean-Francois Romang. It was intended as a promotional tool for LuxRender (to quote original Jromang’s words: "LuxRender propaganda with OpenCL"). The idea was quite simple, wrap SLG inside an easy to use graphical user interface and use it as a benchmark for OpenCL.


The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the server cluster is fully transparent to end users, and the users interact as if it were a single high-performance virtual server.

The real servers and the load balancers may be interconnected by either high-speed LAN or by geographically dispersed WAN. The load balancers can dispatch requests to the different servers and make parallel services of the cluster appear as a virtual service on a single IP address, and request dispatching can use IP load balancing technologies or application-level load balancing technologies. Scalability of the system is achieved by transparently adding or removing nodes in the cluster. High availability is provided by detecting node or daemon failures and reconfiguring the system appropriately.
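
The dispatching and reconfiguration described above can be sketched with a simple round-robin policy. This is illustrative only: the class and method names are hypothetical, round-robin is used for simplicity, and the Linux Virtual Server does its dispatching inside the kernel rather than in Python.

```python
# Sketch of a round-robin dispatcher in front of a pool of real servers.
# All names are hypothetical; this only illustrates the behaviour.

class RoundRobinBalancer:
    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def add(self, server):
        """Scale out: new nodes start receiving requests transparently."""
        self._servers.append(server)

    def remove(self, server):
        """Drop a failed node so that no further requests reach it."""
        index = self._servers.index(server)
        self._servers.pop(index)
        if index < self._next:
            self._next -= 1

    def dispatch(self):
        """Return the server that should handle the next request."""
        if not self._servers:
            raise RuntimeError("no real servers available")
        self._next %= len(self._servers)
        server = self._servers[self._next]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["rs1", "rs2", "rs3"])
first_round = [balancer.dispatch() for _ in range(4)]
balancer.remove("rs2")  # simulate a detected node failure
after_failure = [balancer.dispatch() for _ in range(4)]
```

Clients only ever call `dispatch`, so nodes can come and go without the callers noticing, which mirrors the single-virtual-server view presented to end users.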


The Multiresolution Adaptive Numerical Environment for Scientific Simulation provides a high-level environment for the solution of integral and differential equations in many dimensions using adaptive, fast methods with guaranteed precision based on multi-resolution analysis and novel separated representations. There are three main components to MADNESS. At the lowest level is a new petascale parallel programming environment that increases programmer productivity and code performance/scalability while maintaining backward compatibility with current programming tools such as MPI and Global Arrays. The numerical capabilities built upon the parallel tools provide a high-level environment for composing and solving numerical problems in many (1-6+) dimensions. Finally, built upon the numerical tools are new applications with initial focus upon chemistry, atomic and molecular physics, material science, and nuclear structure.


Matrix algebra on GPU and multicore architectures. The MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current Multicore+GPU systems. MAGMA provides implementations for CUDA, Intel Xeon Phi, and OpenCL.

It is designed to be similar to LAPACK in functionality, data storage, and interface, and to allow scientists to easily port their existing software components from LAPACK to MAGMA. There are two types of LAPACK-style interfaces. The first one, referred to as the CPU interface, takes the input and produces the result in the CPU’s memory. The second, referred to as the GPU interface, takes the input and produces the result in the GPU’s memory. In both cases, a hybrid CPU/GPU algorithm is used. Also included is MAGMA BLAS, a complement to the CUBLAS routines.


clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU optimized BLAS and LAPACK for AMD hardware, can be found in the AMD clMath Libraries (formerly APPML).

make & variants

A useful albeit confusing tool on Linux systems.

Some alternatives to classic GNU make are listed below.


Bazel is a build tool that builds code quickly and reliably. It is used to build the majority of Google’s software, and thus it has been designed to handle build problems present in Google’s development environment.


Ekam ("make" backwards) is a build system which automatically figures out what to build and how to build it purely based on the source code. No separate "makefile" is needed.

Ekam works by exploration. For example, when it encounters a file ending in ".cpp", it tries to compile the file, intercepting system calls to find out its dependencies (e.g. included headers). If some of these are missing, Ekam continues to explore until it finds headers matching them. When Ekam builds an object file and discovers that it contains a "main" symbol, it tries to link it, searching for other object files to satisfy all symbol references therein. When Ekam sees a test, it runs the test, again intercepting system calls to dynamically discover what inputs the test needs (which may not have been built yet). Ekam can be extended to understand new file types by writing simple shell scripts telling it what to do with them.

Thus Ekam is, in fact, Make in reverse. Make starts by reading a Makefile, sees what executables it wants to build, then from there figures out what source files need to be compiled to link into them, then compiles them. Ekam starts by looking for source files to compile, then determines what executables it can build from them, and, in the end, might output a Makefile describing what it did.
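The two directions of search can be contrasted with a small Python sketch. It is only an illustration under stated assumptions: the rule table, the symbol tables and all function names are hypothetical, and the real Ekam discovers dependencies by intercepting system calls rather than reading a prepared table.

```python
def build_order(rules, target, done=None):
    """Make-style: start at a target and walk down to its prerequisites.

    rules maps a target to the list of things it depends on.
    Returns every node visited, prerequisites first.
    """
    if done is None:
        done = []
    for dep in rules.get(target, []):
        build_order(rules, dep, done)
    if target not in done:
        done.append(target)
    return done


def link_set(objects, root):
    """Collect the object files needed to satisfy root's symbol references.

    objects maps an object name to (defined symbols, referenced symbols).
    """
    providers = {sym: name
                 for name, (defined, _) in objects.items()
                 for sym in defined}
    needed, stack = [], [root]
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.append(name)
        for sym in sorted(objects[name][1]):
            if sym in providers:
                stack.append(providers[sym])
    return needed


def discover_executables(objects):
    """Ekam-style: start from what exists and find what can be linked."""
    return {name: link_set(objects, name)
            for name, (defined, _) in objects.items()
            if "main" in defined}
```

For example, `build_order({"app": ["app.o"], "app.o": ["app.c"]}, "app")` walks from the executable down to the source file, while `discover_executables` starts from a pile of object files and reports which of them can become programs.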


A feature-rich set of Autotools configuration files and scripts targeting C/C++ projects, with built-in Doxygen and Google Testing Framework support. It works only on UNIX systems, and a casual build does not depend on anything but a shell interpreter and GNU make. SEAPT is Eclipse CDT friendly.

SEAPT is designed entirely on the basis of the Autotools stack: aclocal, autoconf, autoheader, automake and libtool. It is very similar to the "classic" open source build system (e.g., it needs "configure" to be run), but extends it with Doxygen documentation and Google Test Framework support out of the box, and supplies the necessary Eclipse CDT Autotools Project files. SEAPT suggests a development workflow in which Eclipse is used to write the code and the actual build process is completely independent of any IDE. Though Eclipse is not required at all, it enables deep integration with Autotools and is recommended.

Like cmake, SEAPT provides a nicely formatted colored output. It supports building subdirectories in parallel, compiler version checking and much more.


Mantevo is a multi-faceted application performance project. It provides application performance proxies known as miniapps. Miniapps combine some or all of the dominant numerical kernels contained in an actual stand-alone application. Miniapps include libraries wrapped in a test driver providing representative inputs. They may also be hard-coded to solve a particular test case so as to simplify the need for parsing input files and mesh descriptions. Miniapps range in scale from partial, performance-coupled components of the application to a simplified representation of a complete execution path through the application.


TeaLeaf is a Mantevo mini-app that solves the linear heat conduction equation on a spatially decomposed regular grid using a 5 point stencil with implicit solvers. TeaLeaf currently solves the equations in two dimensions, but three dimensional support is in beta.
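To make the phrase "5 point stencil with implicit solvers" concrete, here is a minimal Python sketch of one implicit time step on a small 2D grid. It assumes backward Euler in time, boundary cells held fixed, and a plain Jacobi iteration as the linear solver. None of these choices is taken from TeaLeaf itself, and the function name and the parameter `r` (conductivity times time step divided by the squared cell size) are illustrative.

```python
def implicit_heat_step(u, r, iterations=200):
    """One backward-Euler step of the heat equation on a 2D grid.

    u is a list of rows of temperatures; boundary cells stay fixed.
    Each interior cell must satisfy
        (1 + 4r) * new[i][j] - r * (sum of its 4 neighbours in new) = u[i][j],
    which is solved here by Jacobi iteration.
    """
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]            # initial guess: the old field
    for _ in range(iterations):
        nxt = [row[:] for row in new]
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                neighbours = (new[i - 1][j] + new[i + 1][j]
                              + new[i][j - 1] + new[i][j + 1])
                nxt[i][j] = (u[i][j] + r * neighbours) / (1 + 4 * r)
        new = nxt
    return new
```

The cell and its four neighbours form the 5 point stencil. Because every new value depends on its neighbours' new values, a linear system has to be solved at each step, which is why the solver dominates the run time of this kind of code.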

The solvers have been written in Fortran with OpenMP and MPI and they have also been ported to OpenCL to provide an accelerated capability. Other versions invoke third party linear solvers and currently include PETSc, Trilinos and Hypre, which are in beta release. For each of these versions there are instructions on how to download, build and link in the relevant library.

Mapbox Studio

Mapbox Studio gives you instant streaming access to massive global datasets like Mapbox Streets, Mapbox Terrain, and Mapbox Satellite without importing any data onto your computer.

Create your own vector tiles using Mapbox Studio. Convert data from traditional formats (Shapefile, GeoJSON, KML, GPX) and upload directly to Mapbox to deploy your vector tiles at scale.


A flexible and complete framework for building rich web-mapping applications. It emphasizes high productivity and high-quality development. MapFish is based on the Pylons Python web framework. MapFish extends Pylons with geospatial-specific functionality. For example, MapFish provides specific tools for creating web services that allow querying and editing geographic objects.

MapFish also provides a complete RIA-oriented JavaScript toolbox, a JavaScript testing environment, and tools for compressing JavaScript code. The JavaScript toolbox is composed of the ExtJS, OpenLayers, and GeoExt JavaScript toolkits.

MapFish is compliant with the Open Geospatial Consortium standards. This is achieved through OpenLayers or GeoExt supporting several OGC norms, like WMS, WFS, WMC, KML, GML, etc.


The GeoMapFish application allows building rich and extensible WebGIS applications in an easy and flexible way. It is composed of a desktop WebGIS interface, an administration interface, an API for map integration in third-party websites and a mobile version. Besides the OGC-standard web services, a MapFish protocol adapted to efficient communication between client and server is available. On this basis, complex and high performance web mapping applications can be built.

MapFish Print

MapFish Print allows printing maps as PDFs. It is written in Java and typically executed as a servlet in a servlet container such as Apache Tomcat.


Mapnik is a Free Toolkit for developing mapping applications. It’s written in C++ and there are Python bindings to facilitate fast-paced agile development. It can comfortably be used for both desktop and web development, which was something I wanted from the beginning.

Mapnik is about making beautiful maps. It uses the AGG library and offers world class anti-aliasing rendering with subpixel accuracy for geographic data. It is written from scratch in modern C++ and doesn’t suffer from design decisions made a decade ago. When it comes to handling common software tasks such as memory management, filesystem access, regular expressions, parsing and so on, Mapnik doesn’t re-invent the wheel, but utilizes best of breed industry standard libraries from

Mapnik uses a plugin architecture to read different datasources. Current plugins can read ESRI shapefiles, PostGIS, TIFF raster, OSM xml, Kismet, as well as all OGR/GDAL formats.

See OGCServer.


An Open Source platform for publishing spatial data and interactive mapping applications to the web. MapServer is an Open Source geographic data rendering engine written in C. Beyond browsing GIS data, MapServer allows you to create “geographic image maps”, that is, maps that can direct users to content.


EOxServer is a Python application and framework for presenting Earth Observation (EO) data and metadata. It implements the OGC Implementation Specifications EO-WCS and EO-WMS on top of MapServer’s WCS and WMS implementations.

EOxServer is open source software for registering, processing, and publishing Earth Observation (EO) data via different Web Services. EOxServer is written in Python and relies on widely-used libraries for geospatial data manipulation.

The core concept of the EOxServer data model is the one of a coverage. In this context, a coverage is a mapping from a domain set (a geographic region of the Earth described by its coordinates) to a range set. For original EO data, the range set usually consists of measurements of some physical quantity (e.g. radiation for optical instruments).


A server that implements tile caching to speed up access to WMS layers. The primary objectives are to be fast and easily deployable, while offering the essential features (and more!) expected from a tile caching solution. MapCache is part of MapServer Suite, but provided as a distinct module.
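The general idea of tile caching can be sketched in a few lines of Python. This is an illustration only: the class, the `render` callback, the `(layer, z, x, y)` key and the least-recently-used eviction policy are assumptions made for the example, not a description of how MapCache stores or expires tiles.

```python
from collections import OrderedDict


class TileCache:
    """Hypothetical in-memory tile cache in front of a slow renderer."""

    def __init__(self, render, capacity=256):
        self.render = render          # callable(layer, z, x, y) -> tile data
        self.capacity = capacity      # maximum number of tiles kept
        self.tiles = OrderedDict()    # key -> tile, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, layer, z, x, y):
        key = (layer, z, x, y)
        if key in self.tiles:
            self.tiles.move_to_end(key)     # mark as recently used
            self.hits += 1
            return self.tiles[key]
        self.misses += 1
        tile = self.render(layer, z, x, y)  # the expensive upstream request
        self.tiles[key] = tile
        if len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)  # evict the least recently used
        return tile
```

Repeated requests for the same tile are answered from memory, so the upstream WMS server is only asked once per tile for as long as the tile stays in the cache.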


A lightweight and fast implementation of the OGC WFS-T specification. Web Feature Service (WFS) allows features to be queried and retrieved. The transactional profile (WFS-T) additionally allows such features to be inserted, updated or deleted. From a technical point of view, WFS-T is a Web Service API in front of a spatial database. TinyOWS is part of MapServer Suite, but provided as a distinct module.


A text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML). Markdown is two things: (1) a plain text formatting syntax; and (2) a software tool, written in Perl, that converts the plain text formatting to HTML.
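To give a feel for what "plain text in, HTML out" means, here is a toy Python converter that handles only `#` headings, `*emphasis*`, `**strong**` and blank-line separated paragraphs. It is a hypothetical sketch for illustration. It is not the Perl tool, and it ignores almost all of the real syntax (lists, links, code, nesting rules).

```python
import html
import re


def tiny_markdown(text):
    """Convert a very small Markdown-like subset to HTML."""
    blocks = []
    for block in re.split(r"\n\s*\n", text.strip()):
        line = html.escape(block.strip(), quote=False)
        # Strong must be handled before emphasis, since ** contains *.
        line = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", line)
        line = re.sub(r"\*(.+?)\*", r"<em>\1</em>", line)
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        if heading:
            level = len(heading.group(1))
            blocks.append(f"<h{level}>{heading.group(2)}</h{level}>")
        else:
            blocks.append(f"<p>{line}</p>")
    return "\n".join(blocks)
```

The input stays readable as plain text, which is the point of the format: the markup characters are ones people already use in email and notes.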

awesome-markdown: A Collection of Markdown Goodies -


A strongly specified, highly compatible implementation of Markdown.


MultiMarkdown, or MMD, is a tool to help turn minimally marked-up plain text into well formatted documents, including HTML, PDF (by way of LaTeX), OPML, or OpenDocument (specifically, Flat OpenDocument or .fodt, which can in turn be converted into RTF, Microsoft Word, or virtually any other word-processing format).

MMD is a superset of the Markdown syntax, originally created by John Gruber. It adds multiple syntax features (tables, footnotes, and citations, to name a few), in addition to the various output formats listed above (Markdown only creates HTML). Additionally, it builds in “smart” typography for various languages (proper left- and right-sided quotes, for example).


MathFu is a C++ math library developed primarily for games focused on simplicity and efficiency.

It provides a suite of vector, matrix and quaternion classes to perform basic geometry suitable for game developers. This functionality can be used to construct geometry for graphics libraries like OpenGL or perform calculations for animation or physics systems.
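As an example of the kind of geometry such classes perform, the Python sketch below rotates a 3D vector with a unit quaternion. It is not MathFu's C++ API; the function names and the `(w, x, y, z)` tuple layout are assumptions chosen for this illustration.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), ax * s, ay * s, az * s)


def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def rotate(q, v):
    """Rotate vector v by unit quaternion q, computed as q * v * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0,) + tuple(v)), conj)
    return (rx, ry, rz)
```

Rotating the x axis by a quarter turn about the z axis with `rotate(quat_from_axis_angle((0, 0, 1), math.pi / 2), (1, 0, 0))` gives a vector along the y axis, up to rounding.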

Matlab/Octave Translation


A package for converting Matlab or Octave to C++.



A Matlab to Python (Scipy/Numpy) translator.


Translates Matlab to Python.

Matlab Toolboxes


Bayesian Compressive Sensing (BCS) is a Bayesian framework for solving the inverse problem of compressive sensing (CS). The basic BCS algorithm adopts the relevance vector machine (RVM), and later it is extended by marginalizing the noise variance with improved robustness. This is a MatLab 7.0 implementation of BCS, VB-BCS (BCS implemented via a variational Bayesian (VB) approach), TS-BCS for wavelet and for block-DCT implemented via both MCMC approach and VB approach.


A collection of Matlab functions that have been used by the authors and collaborators to implement a variety of computational algorithms related to beamlet, curvelet, and ridgelet analysis. It includes about 900 Matlab files, datasets, and demonstration scripts. Some computationally expensive routines have been implemented as Matlab MEX functions.


The Curvelet transform is a higher dimensional generalization of the Wavelet transform designed to represent images at different scales and different angles. Curvelets enjoy two unique mathematical properties, namely:

  • Curved singularities can be well approximated with very few coefficients and in a non-adaptive manner - hence the name "curvelets."

  • Curvelets remain coherent waveforms under the action of the wave equation in a smooth medium.

By releasing the CurveLab toolbox, we hope to encourage the dissemination of curvelets to image processing, inverse problems and scientific computing.


A Matlab implementation of the Sparsity-Promoting Dynamic Mode Decomposition (DMDSP) algorithm. Dynamic Mode Decomposition (DMD) is an effective means for capturing the essential features of numerically or experimentally generated snapshots, and its sparsity-promoting variant DMDSP achieves a desirable tradeoff between the quality of approximation (in the least-squares sense) and the number of modes that are used to approximate available data. Sparsity is induced by augmenting the least-squares deviation between the matrix of snapshots and the linear combination of DMD modes with an additional term that penalizes the ℓ1-norm of the vector of DMD amplitudes. We employ the alternating direction method of multipliers (ADMM) to solve the resulting convex optimization problem and to efficiently compute the globally optimal solution.
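The way an ℓ1 penalty produces sparsity can be shown with the soft-thresholding operator, which is the usual closed-form step for an ℓ1 term inside ADMM iterations. The Python sketch below is a generic illustration written for this list, not code from the Matlab package; the function names and the `gamma` and `kappa` parameters are assumptions.

```python
def soft_threshold(values, kappa):
    """Shrink each amplitude toward zero by kappa; small ones become exactly zero.

    Works for real or complex amplitudes, since only the magnitude is reduced.
    """
    shrunk = []
    for v in values:
        magnitude = abs(v)
        if magnitude <= kappa:
            shrunk.append(0.0)
        else:
            shrunk.append(v * (1.0 - kappa / magnitude))
    return shrunk


def penalized_objective(least_squares_error, amplitudes, gamma):
    """Least-squares deviation plus gamma times the l1-norm of the amplitudes."""
    return least_squares_error + gamma * sum(abs(a) for a in amplitudes)
```

Larger values of `gamma` put more weight on the penalty, so more amplitudes are driven to zero and fewer modes remain in the approximation, at the cost of a larger least-squares error.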


A Matlab Package that implements fast spherical needlet transforms and fast spherical needlet evaluations.


Radial basis function (RBF) methods are advantageous for a wide-range of applications from analyzing/synthesizing "scattered" data (scalar and vector valued quantities) to numerically solving partial differential equations on geometrically difficult domains. This is a Matlab package for working with RBFs.


A collection of MATLAB routines for the Spherical Harmonic Transform and related manipulations in the spherical harmonic spectrum.


Maven is a build automation tool used primarily for Java projects. The word maven means accumulator of knowledge in Yiddish.[3] Maven addresses two aspects of building software: First, it describes how software is built, and second, it describes its dependencies. Contrary to preceding tools like Apache Ant, it uses conventions for the build procedure, and only exceptions need to be written down. An XML file describes the software project being built, its dependencies on other external modules and components, the build order, directories, and required plug-ins. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging. Maven dynamically downloads Java libraries and Maven plug-ins from one or more repositories such as the Maven 2 Central Repository, and stores them in a local cache.[4] This local cache of downloaded artifacts can also be updated with artifacts created by local projects. Public repositories can also be updated.


MBDyn is the first and possibly the only free* general purpose Multibody Dynamics analysis software. It features the integrated multidisciplinary simulation of multibody, multiphysics systems, including nonlinear mechanics of rigid and flexible bodies (geometrically exact & composite-ready beam and shell finite elements, component mode synthesis elements, lumped elements) subjected to kinematic constraints, along with smart materials, electric networks, active control, hydraulic networks, and essential fixed-wing and rotorcraft aerodynamics.

MBDyn simulates the behavior of heterogeneous mechanical, aeroservoelastic systems based on first principles equations. It can be easily coupled to external solvers for co-simulation of multiphysics problems, e.g. Computational Fluid Dynamics (CFD), terradynamics, block-diagram solvers like Scicos, Scicoslab and Simulink, using a simple C, C++ or Python peer-side API.

MBDyn is being actively developed and used in the aerospace (aircraft, helicopters, tiltrotors, spacecraft), wind energy (wind turbines), automotive (cars, trucks) and mechatronic fields (industrial robots, parallel robots, micro aerial vehicles (MAV)) for the analysis and simulation of the dynamics of complex systems.


Morse decompositions for piecewise constant vector fields.

media center


A free and open source (GPL) software media center for playing videos, music, pictures, games, and more. Kodi runs on Linux, OS X, Windows, iOS, and Android, featuring a 10-foot user interface for use with televisions and remote controls. It allows users to play and view most videos, music, podcasts, and other digital media files from local and network storage media and the internet. This was formerly known as XBMC.


MediaGoblin is a free software media publishing platform that anyone can run. You can think of it as a decentralized alternative to Flickr, YouTube, SoundCloud, etc.


MediaWiki is a free software open source wiki package written in PHP, originally for use on Wikipedia. It is now also used by several other projects of the non-profit Wikimedia Foundation and by many other wikis, including this website, the home of MediaWiki.

MediaWiki is an extremely powerful, scalable software and a feature-rich wiki implementation that uses PHP to process and display data stored in a database, such as MySQL. Pages use MediaWiki’s wikitext format, so that users without knowledge of XHTML or CSS can edit them easily. When a user submits an edit to a page, MediaWiki writes it to the database, but without deleting the previous versions of the page, thus allowing easy reverts in case of vandalism or spamming. MediaWiki can manage image and multimedia files, too, which are stored in the filesystem. For large wikis with lots of users, MediaWiki supports caching and can be easily coupled with Squid proxy server software.


The Collection extension for MediaWiki allows users to collect articles and generate downloadable versions in different formats (PDF, OpenDocument Text, etc.) for article collections and single articles.


Provides a library for parsing MediaWiki articles and converting them to different output formats. The Collection extension is a MediaWiki extension enabling users to collect articles and generate PDF files from them.


Kiwix is an offline reader for web content. It’s software intended to make Wikipedia available without using the internet, but it is potentially suitable for all HTML content. Kiwix supports the ZIM format, a highly compressed open format with additional meta-data. The features include a full text search engine, bookmarks and notes, an HTTP server, PDF/HTML export, a user interface in more than 10 languages, tabs navigation, an integrated content manager and downloader, etc.

See also OpenZIM.

Semantic MediaWiki

A free, open-source extension to MediaWiki – the wiki software that powers Wikipedia – that lets you store and query data within the wiki’s pages. Semantic MediaWiki is also a full-fledged framework, in conjunction with many spinoff extensions, that can turn a wiki into a powerful and flexible knowledge management system. All data created within SMW can easily be published via the Semantic Web, allowing other systems to use this data seamlessly.


A model designed to predict the transport and weathering of an oil spill, using a Lagrangian representation of the oil slick. MEDSLIK-II simulates the transport of the surface slick governed by the water currents and by the wind. Oil particles are also dispersed by turbulent fluctuation components that are parameterized with a random walk scheme. In addition to advective and diffusive displacements, the oil spill particles change due to various physical and chemical processes that transform the oil (evaporation, emulsification, dispersion in the water column, adhesion to the coast). MEDSLIK-II includes a proper representation of high frequency currents and wind fields in the advective components of the Lagrangian trajectory model, the introduction of the Stokes drift velocity and the coupling with remote-sensing data.
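The combination of advection and a random walk can be sketched for a handful of particles in Python. This is a simplified illustration, not MEDSLIK-II's scheme: it assumes a uniform current and wind, a constant horizontal diffusivity, and a hypothetical `wind_factor` of 3% for the wind drift, and it leaves out weathering and Stokes drift entirely.

```python
import math
import random


def step_particles(particles, current, wind, dt, diffusivity,
                   wind_factor=0.03, rng=random):
    """Advance (x, y) particle positions in metres by one time step of dt seconds.

    current and wind are (u, v) velocities in m/s; diffusivity is in m^2/s.
    The random displacement has standard deviation sqrt(2 * K * dt) per axis,
    the usual random-walk equivalent of Fickian diffusion.
    """
    sigma = math.sqrt(2.0 * diffusivity * dt)
    u = current[0] + wind_factor * wind[0]
    v = current[1] + wind_factor * wind[1]
    moved = []
    for x, y in particles:
        moved.append((x + u * dt + sigma * rng.gauss(0.0, 1.0),
                      y + v * dt + sigma * rng.gauss(0.0, 1.0)))
    return moved
```

Passing `rng=random.Random(42)` makes a run reproducible. With `diffusivity=0` all particles simply drift with the current and wind, while a positive value makes the cloud of particles spread as it moves.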


A new innovative Python implementation harnessing Google’s super fast Dart Virtual Machine running Python at near native speeds.


A global optimization software tool that integrates two prominent population-based stochastic algorithms, namely Particle Swarm Optimization and Differential Evolution, with well established efficient local search procedures made available via the Merlin optimization environment. The resulting hybrid algorithms, also referred to as Memetic Algorithms, combine the space exploration advantage of their global part with the efficiency asset of the local search, and as expected they have displayed a highly efficient behavior in solving diverse optimization problems. The proposed software is carefully parametrized so as to offer complete control to fully exploit the algorithmic virtues. It is accompanied by comprehensive examples and a large set of widely used test functions, including tough atomic cluster and protein conformation problems.


Mesos is a distributed systems kernel. It is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, Elasticsearch) with APIs for resource management and scheduling across entire datacenter and cloud environments.

Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark, and other frameworks on a dynamically shared pool of nodes.

metadata systems


A prototype highly performant and distributed metadata server for supplying the need for metadata catalogues in a grid environment such as used in the HEP community. AMGA as a metadata service allows users to attach metadata information to files stored on the Grid, where metadata can be any relationally organized data typically stored in a relational database system (RDBMS). In addition, the metadata in AMGA can also be stored independently of any associated files, which allows AMGA to be used as a general access tool to relational databases on the Grid. AMGA features a simple to learn metadata access language, which has been very useful for the adoption of AMGA in smaller Grid applications, as it considerably lowers the technical hurdle to make use of relational data.

One of the main features of AMGA, and one unique to it, is the possibility to replicate metadata between different AMGA instances allowing the federation of metadata, but also to increase the scalability and improve the access times on a globally deployed Grid. Performance and efficiency of the access across WANs has been independently targeted by an access protocol optimised for the bulk transfer of metadata across WANs using data streaming.

The AMGA implementation uses streaming to communicate between client and server which shows a very promising performance. To meet the EGEE requirements, we have also implemented an alternative SOAP-based frontend. The package includes:

  • a streaming and a SOAP front-end;

  • an interactive client for the streaming front-end; and

  • client APIs for the streaming client (C++, Java and Python).


MeteoIO can be seen as a set of modules that is focused on the handling of input/output operations (including data preparation) for numerical simulations in the realm of earth sciences. On the visible side, it offers the following modules, working on a pre-determined set of meteorological parameters or on parameters added by the developer:

  • a set of plugins for accessing the data (for example, a plugin might be responsible for fetching the raw data from a given database)

  • a set of filters and processing elements for applying transformations to the data (for example, a filter might remove all data that is out of range)

  • a set of resampling algorithms to temporally interpolate the data at the required timestamp

  • a set of parametrizations to generate data/meteorological parameters when they could not be interpolated

  • a set of spatial interpolation algorithms (for example, such an algorithm might perform Inverse Distance Weighting for filling a grid with spatially interpolated data)

Each of these steps can be configured and fine tuned according to the needs of the model and the wishes of the user.
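Of the steps listed above, Inverse Distance Weighting is simple enough to show in full. The Python sketch below is a generic textbook version with hypothetical function names and a default power of 2; it is not MeteoIO's C++ implementation or its configuration syntax.

```python
def idw(stations, x, y, power=2.0):
    """Interpolate a value at (x, y) from (sx, sy, value) station tuples.

    Each station is weighted by 1 / distance**power, so nearby stations
    dominate. A point exactly on a station returns that station's value.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in stations:
        dist_sq = (sx - x) ** 2 + (sy - y) ** 2
        if dist_sq == 0.0:
            return value
        weight = dist_sq ** (-power / 2.0)
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def fill_grid(stations, xs, ys, power=2.0):
    """Fill a grid (list of rows, one row per y) with interpolated values."""
    return [[idw(stations, x, y, power) for x in xs] for y in ys]
```

For two stations with values 10 and 20, the point midway between them gets 15, and points closer to one station move toward that station's value.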


Meteor is an ultra-simple environment for building modern websites. What once took weeks, even with the best tools, now takes hours with Meteor.

The web was originally designed to work in the same way that mainframes worked in the 70s. The application server rendered a screen and sent it over the network to a dumb terminal. Whenever the user did anything, that server rerendered a whole new screen. This model served the Web well for over a decade. It gave rise to LAMP, Rails, Django, PHP.

But the best teams, with the biggest budgets and the longest schedules, now build applications in JavaScript that run on the client. These apps have stellar interfaces. They don’t reload pages. They are reactive: changes from any client immediately appear on everyone’s screen.

They’ve built them the hard way. Meteor makes it an order of magnitude simpler, and a lot more fun. You can build a complete application in a weekend, or a sufficiently caffeinated hackathon. No longer do you need to provision server resources, or deploy API endpoints in the cloud, or manage a database, or wrangle an ORM layer, or swap back and forth between JavaScript and Ruby, or broadcast data invalidations to clients.


The simulation and parameter optimization of coupled ocean circulation and ecosystem models in three space dimensions is one of the most challenging tasks in numerical climate research. Here we present a scientific toolkit that aims at supporting researchers by defining clear coupling interfaces, providing state-of-the-art numerical methods for simulation, parallelization and optimization while using only freely available and (to a great extent) platform-independent software. Besides defining a user-friendly coupling interface (API) for marine ecosystem or biogeochemical models, we heavily rely on the Portable, Extensible Toolkit for Scientific computation (PETSc) developed at Argonne Nat. Lab. for a wide variety of parallel linear and non-linear solvers and optimizers. We specifically focus on the usage of matrix-free Newton-Krylov methods for the fast computation of steady periodic solutions, and make use of the Transport Matrix Method (TMM).


A Lisp implemented in < 1 KB of JavaScript with macros, TCO, interop and exception handling.


Software for the analysis of global dynamical fields in (re)analyses, weather forecasts and climate models. MODES enables the diagnosis of properties of balanced and inertio-gravity (IG) circulations across many scales. In particular, the IG spectrum, which has only recently become observable, can be studied simultaneously in the mass and wind fields while considering the whole model depth in contrast to the majority of studies.


A parallelized Python library for finding modal decompositions and reduced-order models. Parallel implementations of the proper orthogonal decomposition (POD), balanced POD (BPOD), dynamic mode decomposition (DMD), and Petrov-Galerkin projection are provided, as well as serial implementations of the Observer Kalman filter Identification method (OKID) and the Eigensystem Realization Algorithm (ERA). Modred is applicable to a wide range of problems and nearly any type of data.

module systems


A Lua-based module system that easily handles the MODULEPATH Hierarchical problem. Environment Modules provide a convenient way to dynamically change the user's environment through modulefiles. This includes easily adding or removing directories to the PATH environment variable. Modulefiles for Library packages provide environment variables that specify where the library and header files can be found.
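What loading and unloading a modulefile does to `PATH` can be mimicked in a few lines of Python. The sketch works on a plain dictionary standing in for the environment; the function names and the example directory are hypothetical and this is not the module system's own Lua interface.

```python
import os


def prepend_path(env, var, directory):
    """Put `directory` at the front of a PATH-like variable, without duplicates."""
    parts = [p for p in env.get(var, "").split(os.pathsep)
             if p and p != directory]
    env[var] = os.pathsep.join([directory] + parts)


def remove_path(env, var, directory):
    """Remove `directory` from a PATH-like variable (the effect of unloading)."""
    parts = [p for p in env.get(var, "").split(os.pathsep)
             if p and p != directory]
    env[var] = os.pathsep.join(parts)
```

Calling `prepend_path(env, "PATH", "/opt/apps/tool/bin")` corresponds to loading a module for a hypothetical tool, and `remove_path` with the same arguments restores the earlier search path.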


A next generation web framework for the Perl programming language.


Mondrian is a general purpose statistical data-visualization system. It features outstanding interactive visualization techniques for data of almost any kind, and has particular strengths, compared to other tools, for working with Categorical Data, Geographical Data and LARGE Data.

All plots in Mondrian are fully linked, and offer many interactions and queries. Any case selected in a plot in Mondrian is highlighted in all other plots.

Currently implemented plots comprise Histograms, Boxplots y by x, Scatterplots, Barcharts, Mosaicplots, Missing Value Plots, Parallel Coordinates/Boxplots, SPLOMs and Maps.

Mondrian works with data in standard tab-delimited or comma-separated ASCII files and can load data from R workspaces. There is basic support for working directly on data in Databases (please email for further info).

Mondrian is written in Java and is distributed as a native application (wrapper) for MacOS X and Windows. Linux users need to start the jar-file.


MongoDB (from humongous) is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.


D4 is an automated tool for generating distributed document database designs for applications running on MongoDB. This tool specifically targets applications running highly concurrent workloads, and thus its designs are tailored to the unique properties of large-scale, Web-based applications. It can also be used to assist in porting MySQL-based applications to MongoDB.

Using a sample workload trace from either a document-oriented or relational database application, D4 will compute the best database design that optimizes the throughput and latency of a document DBMS.


Moodle is a free, online Learning Management system enabling educators to create their own private website filled with dynamic courses that extend learning, any time, anywhere. Whether you’re a teacher, student or administrator, Moodle can meet your needs. Moodle’s extremely customisable core comes with many standard features.


The Multiphysics Object-Oriented Simulation Environment (MOOSE) is a finite-element, multiphysics framework primarily developed by Idaho National Laboratory. It provides a high-level interface to some of the most sophisticated nonlinear solver technology on the planet. MOOSE presents a straightforward API that aligns well with the real-world problems scientists and engineers need to tackle. Every detail about how an engineer interacts with MOOSE has been thought through, from the installation process to running your simulation on state-of-the-art supercomputers. The MOOSE system will accelerate your research.

Continuous Integration for Concurrent Computational Framework and Application Development (online article) -


Mopidy is an extensible music server written in Python.

Mopidy plays music from local disk, Spotify, SoundCloud, Google Play Music, and more. You edit the playlist from any phone, tablet, or computer using a range of MPD and web clients.


MORSE is a generic simulator for academic robotics. It focuses on realistic 3D simulation of small to large environments, indoor or outdoor, with one to tens of autonomous robots.

MORSE can be entirely controlled from the command-line. Simulation scenes are generated from simple Python scripts.

MORSE comes with a set of standard sensors (cameras, laser scanner, GPS, odometry,…​), actuators (speed controllers, high-level waypoints controllers, generic joint controllers) and robotic bases (quadrotors, ATRV, Pioneer3DX, generic 4 wheel vehicle, PR2,…​). New ones can easily be added.

MORSE rendering is based on the Blender Game Engine. The OpenGL-based Game Engine supports shaders, provides advanced lighting options, supports multi-texturing, and uses the state-of-the-art Bullet library for physics simulation.


MoviePy is a Python module for video editing, which can be used for basic operations (like cuts, concatenations, title insertions), video compositing (a.k.a. non-linear editing), video processing, or to create advanced effects. It can read and write the most common video formats, including GIF.


Comparison among OOP versions of an MPDATA code written using Python, Fortran and C++.


The Message Passing Interface…​


BDMPI is a message passing library and associated runtime system for developing out-of-core distributed computing applications for problems whose aggregate memory requirements exceed the amount of memory that is available on the underlying computing cluster. BDMPI is based on the Message Passing Interface (MPI) and provides a subset of MPI’s API along with some extensions that are designed for BDMPI’s memory and execution model.

A BDMPI-based application is a standard memory-scalable parallel MPI program that was developed assuming that the underlying system has enough computational nodes to allow for the in-memory execution of the computations. This program is then executed using a sufficiently large number of processes so that the per-process memory fits within the physical memory available on the underlying computational node(s). BDMPI maps one or more of these processes to the computational nodes by relying on the OS’s virtual memory management to accommodate the aggregate amount of memory required by them. BDMPI prevents memory thrashing by coordinating the execution of these processes using node-level co-operative multi-tasking that limits the number of processes that can be running at any given time. This ensures that the currently running process(es) can establish and retain memory residency and thus achieve efficient execution. BDMPI exploits the natural blocking points that exist in MPI programs to transparently schedule the co-operative execution of the different processes. In addition, BDMPI’s implementation of MPI’s communication operations is designed to maximize the time over which a process can execute between successive blocking points. This allows it to amortize the cost of loading data from disk over the maximal amount of computation that can be performed.

Since BDMPI is based on the standard MPI library, it also provides a framework that allows the automated out-of-core execution of existing MPI applications. BDMPI is implemented as a drop-in replacement for existing MPI implementations, so existing codes that use only the subset of MPI functions implemented by BDMPI compile unchanged.


DataMPI is an efficient, flexible, and productive communication library, which provides a set of key-value pair based communication interfaces that extends MPI for Big Data. By utilizing the efficient communication technologies of the High-Performance Computing area, DataMPI can speed up emerging data-intensive computing applications. DataMPI takes a step in bridging the two fields of HPC and Big Data.

DataMPI can support multiple modes for various Big Data Computing applications, including Common, MapReduce, Streaming, and Iteration. The current version implements the functionalities and features of the Common mode, which aims to support the single program, multiple data (SPMD) applications. The remaining modes will be released in the future.

The current implementation of DataMPI is extending mpiJava. We also integrate some features from Hadoop under Apache License 2.0. The current evaluations of DataMPI use MVAPICH2 as the backend. DataMPI also supports other MPI implementations, such as MPICH2.


An implementation of the MPI (Message Passing Interface) standard designed for high performance computing in the Grid. It establishes a synthesized cluster computer by binding multiple cluster computers distributed geographically. Users are able to seamlessly deploy their application programs from a local system to the Grid environment for processing a very large data set, which is too large to run on a single system.

GridMPI aims to make global communications efficient by optimizing the behavior of protocols over links with non-uniform latency and bandwidth, and to hide the details of the lower-level network geometry from users. The GridMPI project is working to provide variations of collective communication algorithms, an abstract layer to hide network geometry, and an interface to the TCP/IP communication layer to make it adaptive to those algorithms.

PSPacer is a precise software pacer of IP traffic for Linux, which controls IP traffic to regulate bandwidth and smooth bursty traffic. Although it is implemented as a Linux loadable kernel module, it controls the traffic at a very high precision (less than a microsecond), which was previously possible only with special hardware. PSPacer is a standalone module and is not bound to GridMPI. Its applications vary widely and include, for example, high-bandwidth TCP/IP streaming and traffic control of low-bandwidth lines.


MPICH-Madeleine is a free MPICH-based implementation of the MPI standard, which is a high-level communication interface designed to provide high performance communications on various network architectures including supercomputers and clusters of workstations (usually off-the-shelf PCs interconnected by high speed links). Clusters of workstations (COWs) have become increasingly popular thanks to the availability of many high speed connection technologies (Gigabit-Ethernet, Myrinet, GigaNet, SCI). Furthermore, interconnecting such COWs to build heterogeneous clusters of clusters is now a hot issue. Unfortunately, no current MPI implementation supports this kind of architecture efficiently. Indeed, the only way to handle network heterogeneity is to use interoperable implementations of MPI: several MPI implementations (one per cluster) communicate with each other using an inter-MPI glue.

Our alternate proposal is to provide a true multi-protocol implementation of MPI on top of a generic and multi-protocol communication layer called Madeleine (version 3). Madeleine III is the communication sub-system of the Parallel Multithreaded Machine runtime environment.

This project is deprecated in favor of NewMadeleine.


A communication library for message passing across wide area networks. MPWide has been designed to connect applications running on distributed (super)computing resources, and to maximize the communication performance on wide area networks for those without administrative privileges. It can be used to provide message passing between applications, move files, and make very fast connections in client-server environments.

The core MPWide functionalities are provided by the MPWide C API, the communication codebase, and the Socket class. The Socket class is used to manage and use individual TCP connections, while the role of the communication codebase is to provide the MPWide API functionalities in C, using the Socket class.

MPWide relies on a number of data structures, which are used to make it easier to manage the customized connections between endpoints. The most straightforward way to construct a connection in MPWide is to create a communication path. Each path consists of one or more TCP streams, each of which is used to facilitate actual communications over that path. Using a single TCP stream is sufficient to enable a connection, but in many wide area networks, MPWide will deliver much better performance when multiple streams are used. MPWide supports the presence of multiple paths, and the creation and deletion of paths at runtime.

MPWide comes with a number of parameters which allow users to optimize the performance of individual paths. Aside from varying the number of streams, users can modify the size of data sent and received per low-level communication call (the chunk size), the TCP window size, and limit the throughput for individual streams by adjusting the communication pacing rate. The number of streams always needs to be provided by the user when creating a path, but users can choose to have the other parameters automatically tuned by enabling the MPWide autotuner.
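As a rough illustration of how the number of streams and the chunk size interact, the sketch below cuts a payload into fixed-size chunks, assigns them to streams in round-robin order, and reassembles them on the receiving side. It does not use MPWide's API and may not match the way MPWide actually distributes data; the function names `split_across_streams` and `reassemble` and the round-robin policy are assumptions made purely for illustration.

```python
def split_across_streams(payload: bytes, num_streams: int, chunk_size: int):
    """Assign consecutive chunks of payload to streams in round-robin order."""
    streams = [[] for _ in range(num_streams)]
    for index, start in enumerate(range(0, len(payload), chunk_size)):
        streams[index % num_streams].append(payload[start:start + chunk_size])
    return streams


def reassemble(streams):
    """Interleave the per-stream chunk lists back into the original payload."""
    out = []
    longest = max((len(s) for s in streams), default=0)
    for i in range(longest):
        for s in streams:
            if i < len(s):
                out.append(s[i])
    return b"".join(out)
```

With a 10-byte payload, three streams and a chunk size of 2, the first stream carries chunks 0 and 3, the second carries chunks 1 and 4, and the third carries chunk 2.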

A Python interface to MPWide is available.

MPWide: a light-weight library for efficient message passing over wide area networks (online paper) -


NewMadeleine is the fourth incarnation of the Madeleine communication library. The new architecture aims at enabling the use of a much wider range of communication flow optimization techniques. Its design is entirely modular: drivers and optimization strategies are dynamically loadable software components, allowing experimentation with multiple approaches or on multiple issues with regard to processing communication flows.

The optimizing scheduler SchedOpt targets applications with irregular, multi-flow communication schemes, such as those found in the increasingly common application conglomerates made of multiple programming environments and coupled pieces of code. SchedOpt itself is easily extensible through the concepts of optimization strategies (what to optimize for, what the optimization goal is) expressed in terms of tactics (how to optimize to reach the optimization goal). Tactics themselves are made of basic communication flow operations such as packet merging or reordering.

The communication library is fully multi-threaded through its close integration with PIOMan. It manages concurrent communication operations from multiple libraries and from multiple threads. Its MPI implementation Mad-MPI fully supports the MPI_THREAD_MULTIPLE multi-threading level. It is available on Infiniband (ibverbs), Myrinet (MX and GM), TCP (sockets) and legacy SCI and Quadrics QsNet-2.

Open MPI

The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. Open MPI offers advantages for system and software vendors, application developers and computer science researchers.


The Open Resilient Cluster Manager (ORCM) was originally developed as an open-source project (under the Open MPI license) by Cisco Systems, Inc to provide a resilient, 100% uptime run-time environment for enterprise-class routers. Based on the Open Run-Time Environment (ORTE) embedded in Open MPI, the system provided launch and execution support for processes executing within the router itself (e.g., computing routing tables), ensuring that a minimum number of copies of each program were always present.

ORCM (Open Resilient Cluster Manager) is derived from the Open MPI implementation. It consists of:

  • ORCM: Open Resilient Cluster Manager. Provides the following: Resource Management, Scheduler, Job launcher and Resource Monitoring subsystem.

  • ORTE: The Open Run-Time Environment (support for different back-end run-time systems). Provides the RM messaging interface, RM error management subsystem, RM routing subsystem and RM resource allocation subsystem.

  • OPAL: The Open Portable Access Layer (utility and "glue" code used by ORCM and ORTE). Provides operating system interfaces.


A new algorithm, featuring overlapping domain decompositions, for the parallel construction of Delaunay and Voronoi tessellations is developed. Overlapping allows for the seamless stitching of the partial pieces of the global Delaunay tessellations constructed by individual processors. The algorithm is then modified, by the addition of stereographic projections, to handle the parallel construction of spherical Delaunay and Voronoi tessellations. The algorithms are then embedded into algorithms for the parallel construction of planar and spherical centroidal Voronoi tessellations that require multiple constructions of Delaunay tessellations. This combination of overlapping domain decompositions with stereographic projections provides a unique algorithm for the construction of spherical meshes that can be used in climate simulations.
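The stereographic projection mentioned above can be sketched in a few lines. This is only a minimal illustration of the map between the unit sphere and a plane, assuming projection from the north pole onto the plane z = 0; it is not the paper's algorithm, and the function names `to_plane` and `to_sphere` are hypothetical.

```python
def to_plane(p):
    """Project a unit-sphere point (x, y, z) with z != 1 from the north pole onto z = 0."""
    x, y, z = p
    return (x / (1.0 - z), y / (1.0 - z))


def to_sphere(q):
    """Inverse map: lift a plane point (X, Y) back onto the unit sphere."""
    X, Y = q
    d = 1.0 + X * X + Y * Y
    return (2.0 * X / d, 2.0 * Y / d, (d - 2.0) / d)
```

A planar tessellation code could then, in principle, work on the projected points and map results back with `to_sphere`. The north pole itself has no image in the plane and needs separate treatment.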

See also STRIPACK.


The mpld3 project brings together Matplotlib, the popular Python-based graphing library, and D3js, the popular Javascript library for creating interactive data visualizations for the web. The result is a simple API for exporting your matplotlib graphics to HTML code which can be used within the browser, within standard web pages, blogs, or tools such as the IPython notebook.


In a first course on classical mechanics, elementary physical processes like elastic two-body collisions, the mass–spring model, or the gravitational two-body problem are discussed in detail. The continuation to many-body systems, however, is deferred to graduate courses, although the underlying equations of motion are essentially the same and although high-school students in particular are strongly motivated by the use of particle systems in computer games. The missing link between the simple and the more complex problem is a basic introduction to solving the equations of motion numerically, which could be illustrated by means of the Euler method. The many-particle physics simulation package MPPhys offers a platform to experiment with simple particle simulations. The aim is to give a basic idea of how to implement many-particle simulations and how simulation and visualization can be combined for interactive visual exploration.
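To make the Euler method concrete, here is a minimal sketch for the mass–spring model (m x'' = -k x). It is independent of MPPhys; the function names `euler_mass_spring` and `energy` and all parameter values are assumptions chosen for illustration.

```python
def euler_mass_spring(x0, v0, k, m, dt, steps):
    """Integrate m*x'' = -k*x with the explicit (forward) Euler method."""
    x, v = x0, v0
    trajectory = [(0.0, x, v)]
    for n in range(1, steps + 1):
        a = -k / m * x                    # acceleration from Hooke's law
        x, v = x + dt * v, v + dt * a     # both updates use the old state
        trajectory.append((n * dt, x, v))
    return trajectory


def energy(x, v, k, m):
    """Total mechanical energy: kinetic plus elastic potential."""
    return 0.5 * m * v * v + 0.5 * k * x * x
```

For this system the forward Euler scheme multiplies the total energy by (1 + dt² k/m) at every step, so the computed oscillation slowly grows. That drift is a useful first lesson on why the time step matters and why more careful integrators are used in practice.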


With MPS you can design your own extensible DSLs and start using them right away to build end-user applications. The unique technology of projectional editing allows you to overcome the limits of language parsers and build much richer DSL editors, such as ones with tables and diagrams. Along with the editors, you can write comprehensive generators from your DSL to multiple target languages, be it another MPS DSL, or any of the "base" languages such as Java, C, XML, and others.


MPWide is a light-weight communication library for distributed computing. It is specifically developed to allow message passing over long-distance networks using path-specific optimizations.


Morse-Smale Complex Extraction, Exploration, and Reasoning is a set of tools and libraries for feature extraction and exploration in scalar fields. MSCEER computes a gradient-based abstract representation of a scalar field.


The mscomplex3d package consists of two modules for computation and analysis of Morse-Smale complexes on 3D grids. The first is a command-line executable named mscomplex3d. The second is a Python loadable module named pyms3d. The Morse-Smale complex is a topological data structure that partitions datasets based on the gradients of an input scalar function. See here for a quick introduction to Morse-Smale complexes. This website presents software that computes the Morse-Smale complex of scalar functions defined on 3D structured grids and 2D triangle meshes.


MTT comprises a set of tools for modelling dynamic physical systems using the bond-graph methodology and transforming these models into representations suitable for analysis, control and simulation.


The MUltifrontal Massively Parallel Solver (MUMPS) is a package for solving systems of linear equations of the form Ax=b, where A is a sparse matrix that can be either unsymmetric, symmetric positive definite, or general symmetric, on distributed memory computers. MUMPS implements a direct method based on a multifrontal approach which performs a Gaussian factorization.


MurCSS (Murphy-Epstein decomposition and Continuous Ranked Probability Skill Score) is a tool for standardized evaluation of decadal hindcast-prediction systems written in Python using CDO. It analyzes decadal hindcast experiments in a deterministic and probabilistic way.

MurCSS: A Tool for Standardized Evaluation of Decadal Hindcast Systems (online article) -


The Multiscale Coupling Library and Environment is a portable framework to do multiscale modeling and simulation on distributed computing resources. The generic coupling mechanism of MUSCLE is suitable for many types of multiscale applications, notably for multiscale models as defined by the MAPPER project or complex automata as defined in the COAST project. Submodels can be implemented from scratch, but legacy code can also be used with only minor adjustments. The runtime environment solves common problems in distributed computing and couples submodels of a multiscale model, whether they are built for high-performance supercomputers or for local execution. MUSCLE supports Java, C, C++, Fortran, Python, MATLAB and Scala code, using MPI, OpenMP, or threads.



The purpose of beets is to get your music collection right once and for all. It catalogs your collection, automatically improving its metadata as it goes using the MusicBrainz database. Then it provides a bouquet of tools for manipulating and accessing your music.

Because beets is designed as a library, it can do almost anything you can imagine for your music collection. Via plugins, beets becomes a panacea:

  • Fetch or calculate all the metadata you could possibly need: album art, lyrics, genres, tempos, ReplayGain levels, or acoustic fingerprints.

  • Get metadata from MusicBrainz, Discogs, or Beatport. Or guess metadata using songs' filenames or their acoustic fingerprints.

  • Transcode audio to any format you like.

  • Check your library for duplicate tracks and albums or for albums that are missing tracks.

  • Browse your music library graphically through a Web browser and play it in any browser that supports HTML5 Audio.


Copies between local file systems are a daily activity. Files are constantly being moved to locations accessible by systems with different functions and/or storage limits, being backed up and restored, or being moved due to upgraded and/or replaced hardware. Hence, maximizing the performance of copies as well as checksums that ensure the integrity of copies is desirable to minimize the turnaround time of user and administrator activities. Modern parallel file systems provide very high performance for such operations using a variety of techniques such as striping files across multiple disks to increase aggregate I/O bandwidth and spreading disks across multiple servers to increase aggregate interconnect bandwidth.

To achieve peak performance from such systems, it is typically necessary to utilize multiple concurrent readers/writers from multiple systems to overcome various single-system limitations such as number of processors and network bandwidth. The standard cp and md5sum tools of GNU coreutils found on every modern Unix/Linux system, however, utilize a single execution thread on a single CPU core of a single system, hence cannot take full advantage of the increased performance of clustered file systems.

Mutil provides mcp and msum, which are drop-in replacements for cp and md5sum that utilize multiple types of parallelism to achieve maximum copy and checksum performance on clustered file systems. Multi-threading is used to ensure that nodes are kept as busy as possible. Read/write parallelism allows individual operations of a single copy to be overlapped using asynchronous I/O. Multi-node cooperation allows different nodes to take part in the same copy/checksum. Split file processing allows multiple threads to operate concurrently on the same file. Finally, hash trees allow inherently serial checksums to be performed in parallel.
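The hash tree idea can be sketched as follows: hash fixed-size blocks independently (which can happen in parallel), then combine the digests pairwise until one root digest remains. This is a simplified illustration, not msum's actual scheme or output format; the function name `tree_checksum`, the tiny default block size, and the pairwise combination rule are assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def _md5(data: bytes) -> bytes:
    return hashlib.md5(data).digest()


def tree_checksum(data: bytes, block_size: int = 4, workers: int = 4) -> str:
    """Hash fixed-size blocks independently, then combine digests pairwise."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(_md5, blocks))   # leaves are independent of each other
    while len(level) > 1:
        # Concatenate neighbouring digests (an odd one out is hashed alone).
        pairs = [b"".join(level[i:i + 2]) for i in range(0, len(level), 2)]
        level = [_md5(p) for p in pairs]
    return level[0].hex()
```

Note that the root digest of such a tree generally differs from the plain MD5 of the whole input, so both sides of a comparison must use the same block size and combination rule.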


The Muster library provides implementations of serial and parallel K-Medoids clustering algorithms. It is intended as a general framework for parallel cluster analysis, particularly for performance data analysis on systems with very large numbers of processes.

The parallel implementations in Muster are designed to perform well even in environments where the data to be clustered is entirely distributed. For example, many performance tools need to analyze one data element from each process in a system. To analyze this data efficiently, clustering algorithms that move as little data as possible are required. In Muster, we exploit sampled clustering algorithms to realize this efficiency.

The parallel algorithms in Muster are implemented using the Message Passing Interface (MPI), making them suitable for use on many of the world’s largest supercomputers. They should, however, also run efficiently on your laptop.
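For readers unfamiliar with K-Medoids, the following is a minimal serial sketch of the basic alternating scheme: assign each point to its nearest medoid, then pick the member of each cluster that minimizes the total distance to the others. It is not Muster's API and does not include the sampled or parallel variants described above; the function names and the naive "first k points" initialization are assumptions.

```python
def assign(points, medoids, dist):
    """Group points by their nearest medoid."""
    clusters = [[] for _ in medoids]
    for p in points:
        nearest = min(range(len(medoids)), key=lambda c: dist(p, medoids[c]))
        clusters[nearest].append(p)
    return clusters


def k_medoids(points, k, dist, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(points[:k])            # naive initialization (assumption)
    for _ in range(max_iter):
        clusters = assign(points, medoids, dist)
        updated = [
            min(members, key=lambda c: sum(dist(c, q) for q in members)) if members else old
            for old, members in zip(medoids, clusters)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids, dist)
```

Unlike K-Means, every cluster representative is an actual data point, and only a pairwise distance function is needed, which is convenient when the items being clustered are not vectors.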


Parallel wavelet compression.

National Data Service

The National Data Service is an emerging vision of how scientists and researchers across all disciplines can find, reuse, and publish data. It is an international federation of data providers, data aggregators, community-specific federations, publishers, and cyberinfrastructure providers. It builds on the data archiving and sharing efforts under way within specific communities and links them together with a common set of tools.


Navigation and estimation tools written in Python.


Parallel computation is widely employed in scientific research, engineering activities and product development. Writing parallel programs is not always a simple task, depending on the problems being solved. Large-scale scientific computing, huge data analyses and precise visualizations, for example, require parallel computation, which in turn requires parallelization techniques. In this chapter, support for parallel program generation is discussed, and a computer-assisted parallel program generation system, P-NCAS, is introduced. Computer-assisted problem solving is one of the key methods for promoting innovation in science and engineering, and contributes to enriching our society and our lives by moving toward a programming-free environment in computing science. Research on problem solving environments (PSEs) started in the 1970s to enhance programming power. P-NCAS is one such PSE. The PSE concept provides an integrated, human-friendly computational software and hardware system to solve a target class of problems.


NCAR Command Language (NCL) is an interpreted language designed specifically for scientific data analysis and visualization. NCL has robust file input and output. It can read and write netCDF-3, netCDF-4 classic, netCDF-4, HDF4, binary, and ASCII data. It can read HDF-EOS2, HDF-EOS5, GRIB1, GRIB2, and OGR files (shapefiles, MapInfo, GMT, Tiger). It can be built as an OPeNDAP client.

NCL visualizations are world class and highly customizable, and use the separate NCAR Graphics package. Both NCL and NCAR Graphics are released as one package in source code or pre-compiled binary format.

NCL can be run in interactive mode, where each line is interpreted as it is entered at your workstation, or it can be run in batch mode as an interpreter of complete scripts. You can also use command line options to set options or variables on the NCL command line.

See also PyNIO.


NcSOS adds an OGC SOS service to datasets in your existing THREDDS server. It complies with the IOOS SWE Milestone 1.0 templates and requires your datasets be in any of the CF 1.6 Discrete Sampling Geometries.

NcSOS acts like other THREDDS services (such as OPeNDAP and WMS) in that there are individual service endpoints for each dataset. It is best to aggregate your files and enable the NcSOS service on top of the aggregation. For example, the NcML aggregate of hourly files from an individual station would be a good candidate to serve with NcSOS. Serving the individual hourly files with NcSOS would not be as beneficial.

You will need a working THREDDS installation of at least version 4.3.16 to run NcSOS.


The numerical differentiation library (NDL) is used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model, allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes.
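Finite differencing of first and second order partial derivatives can be sketched with standard central-difference formulas. This is a generic illustration, not NDL's interface or its choice of formulas; the function names `partial` and `second_partial` and the default step sizes are assumptions.

```python
def partial(f, x, i, h=1e-5):
    """Central-difference estimate of df/dx_i at the point x (a list of floats)."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)


def second_partial(f, x, i, j, h=1e-4):
    """Central-difference estimate of d2f/(dx_i dx_j) at the point x."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di * h
        y[j] += dj * h
        return f(y)

    if i == j:
        return (shifted(1, 0) - 2 * f(list(x)) + shifted(-1, 0)) / (h * h)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
```

Every function evaluation in these formulas is independent of the others, which is what makes a task-based parallelization of the kind described above natural: a full gradient or Hessian is a batch of unrelated calls to `f`.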


Neko is a high-level dynamically typed programming language. It can be used as an embedded scripting language. It has been designed to provide a common runtime for several different languages. Learning and using Neko is very easy. You can easily extend the language with C libraries. You can also write generators from your own language to Neko and then use the Neko Runtime to compile, run, and access existing libraries.


Neo is a minimal and fast Go web framework with an extremely simple API.

During development you will enjoy automatic recompiling and rerunning of your Neo application when the source changes.

Build your Neo application in a few lines of code.


An open-source NoSQL graph database implemented in Java and Scala. With development starting in 2003, it has been publicly available since 2007. The source code and issue tracking are available on GitHub, with support readily available on Stack Overflow and the Neo4j Google group.

Neo4j implements the Property Graph Model efficiently down to the storage level. As opposed to graph processing or in-memory libraries, Neo4j provides full database characteristics including ACID transaction compliance, cluster support, and runtime failover, making it suitable to use graph data in production scenarios.

Neo4j’s free and open-source Community edition is a high-performance, fully ACID-transactional database. The Community edition includes (but is not limited to) all the functionality described previously in this section.


GraphGists are an easy way to create and share documents containing not just prose, structure and pictures but most importantly example graph models and use-cases expressed in Neo4j’s query language Cypher. These documents are written in AsciiDoc — the simple, textual markup language — and rendered in your browser as rich and interactive web pages that you can quickly evolve from describing simple howtos or questions to providing an extensive use-case specification.


The NESToolbox is a collection of algorithms to perform similarity estimation for irregularly sampled time series as they arise for example in the geosciences. It is implemented as a toolbox for the widely used software MATLAB and the freely available open-source software OCTAVE.

The installation of the Python portation is simple: just copy the in your working directory.



CF-compliant NetCDF for radial data.


The "Climate Model Output Rewriter" (CMOR, pronounced "Seymour") comprises a set of C-based functions, with bindings to both Python and FORTRAN 90, that can be used to produce CF-compliant netCDF files that fulfill the requirements of many of the climate community’s standard model experiments. These experiments are collectively referred to as MIPs and include, for example, AMIP, CMIP, CFMIP, PMIP, APE, and IPCC scenario runs. The output resulting from CMOR is "self-describing" and facilitates analysis of results across models.

Much of the metadata written to the output files is defined in MIP-specific tables, typically made available from each MIP’s web site. CMOR relies on these tables to provide much of the metadata that is needed in the MIP context, thereby reducing the programming effort required of the individual MIP contributors.


The NetCDF Fortran libraries.


The software package to be disclosed is a front end to an existing free software package, CMOR2 (Climate Model Output Rewriter), written by Lawrence Livermore National Laboratory (LLNL). It reads in a multitude of standard data formats, such as netcdf3, netcdf4, GrADS control files, Matlab data files or a list of netcdf files, and converts the data into the CMIP5 data format to allow publication on the Earth System Grid Federation (ESGF) data node.


Pythonic interface to netCDF4 via h5py.


The NCIO (NetCDF Input/Output) module has been designed as an interface to the NetCDF library with simplicity and ease of use in mind. While this implies that some NetCDF functionality is masked from the user, the subroutines provided here are adequate for basic serial reading and writing tasks of up to 6-D data arrays along with corresponding data attributes.


The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formats, including DAP, HDF4, and HDF5. It exploits the geophysical expressivity of many CF (Climate & Forecast) metadata conventions, the flexible description of physical dimensions translated by UDUnits, the network transparency of OPeNDAP, the storage features (e.g., compression, chunking, groups) of HDF (the Hierarchical Data Format), and many powerful mathematical and statistical algorithms of GSL (the GNU Scientific Library).

The netCDF Operators (NCO) comprise a dozen standalone, command-line programs that take netCDF, HDF, and/or DAP files as input, then operate (e.g., derive new data, compute statistics, print, hyperslab, manipulate metadata) and output the results to screen or files in text, binary, or netCDF formats. NCO aids analysis of gridded scientific data. The shell-command style of NCO allows users to manipulate and analyze files interactively, or with expressive scripts that avoid some overhead of higher-level programming environments.

NetCDF Java

The NetCDF-Java library implements a Common Data Model (CDM), a generalization of the NetCDF, OpenDAP and HDF5 data models. The library is a prototype for the NetCDF-4 project, which provides a C language API for the "data access layer" of the CDM, on top of the HDF5 file format. The NetCDF-Java library is a 100% Java framework for reading netCDF and other file formats into the CDM, as well as writing to the netCDF-3 file format. Writing to the netCDF-4 file format requires installing the netCDF C library. The NetCDF-Java library also implements NcML, which allows you to add metadata to CDM datasets, as well as to create virtual datasets through aggregation. The THREDDS Data Server (TDS) is built on top of the NetCDF-Java library.


Python/numpy interface to the netCDF C library.


Wrapper around python-netCDF4 that allows Coordinate subscripting, similar to NCL.


A library providing high-performance parallel I/O while still maintaining file-format compatibility with Unidata’s NetCDF, specifically the formats of CDF-1 and CDF-2. Although NetCDF supports parallel I/O starting from version 4, the files must be in HDF5 format. PnetCDF is currently the only choice for carrying out parallel I/O on files that are in classic formats (CDF-1 and 2). In addition, PnetCDF supports the CDF-5 file format, an extension of CDF-2, that supports more data types and allows users to define large dimensions, attributes, and variables.


A package for converting NetCDF files from time-slice (history) format to time-series (single-variable) format. It performs time-slice to time-series conversion of NetCDF files, compliant with the CF 1.6 Conventions. The PyReshaper package is designed to run in parallel to maximize performance, with the parallelism implemented over variables (i.e., task parallelism). This means that the maximum parallelism achievable for a given operation is one core/processor per variable in the time-slice NetCDF files.
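The reshaping itself, and why it parallelizes over variables, can be shown with plain Python dictionaries standing in for NetCDF files. This is a conceptual sketch only and does not use PyReshaper or any NetCDF library; the function names, the in-memory data layout, and the use of a thread pool are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_variable(time_slices, name):
    """Collect one variable's values across all time slices, in time order."""
    return [step[name] for step in time_slices]


def slices_to_series(time_slices, names, workers=4):
    """Build one time series per variable; each variable is an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        columns = pool.map(lambda name: extract_variable(time_slices, name), names)
        return dict(zip(names, columns))
```

Because each variable's series depends only on that variable's values, no task needs data from another, and the number of useful workers is capped by the number of variables.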


Python tools for staggered grids.


A Python API to utilize data written using the netcdf unstructured grid conventions.


A web-based service that provides an easy, wizard-based interface for data collectors to transform their datalogger generated ASCII output into Climate and Forecast (CF) compliant netCDF files, complete with metadata describing what data are contained in the file, the instruments used to collect the data, and other critical information that otherwise may be lost in one of many dreaded README files.


Running IPython notebook servers on Amazon’s EC2.


We introduce the very first NPU compilation workflow, called NPiler, which automatically converts annotated regions of imperative code to a neural network representation. First, the programmer annotates the regions of imperative code which he/she wants to transform to a neural representation. NPiler accepts inputs from the programmer to train the network. During this step, NPiler automatically observes the input and output pairs to the annotated regions to collect training and testing data. Then, NPiler trains each possible NPU topology given constraints provided by the programmer. The outcome of this exploration provides the best possible NPU topology in terms of minimum root mean square error (RMSE) on test data. Finally, our compiler replaces the annotated regions with the final neural network representation. We use the FANN library to execute the neural network representation. We released NPiler with seven representative benchmarks from diverse domains to evaluate our NPU compilation workflow.
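The selection step described above, choosing the candidate with the minimum RMSE on test data, can be sketched as follows. This is not NPiler's code and involves no actual network training; the candidates here are ordinary Python callables standing in for trained networks, and the names `rmse` and `best_topology` are hypothetical.

```python
import math


def rmse(predicted, expected):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected))


def best_topology(candidates, test_inputs, test_outputs):
    """Return (name, error) of the candidate with the lowest RMSE on the test data."""
    scored = {
        name: rmse([model(x) for x in test_inputs], test_outputs)
        for name, model in candidates.items()
    }
    name = min(scored, key=scored.get)
    return name, scored[name]
```

Here `candidates` maps a topology label to a function that approximates the annotated region, and the test pairs play the role of the observed inputs and outputs collected during the annotation step.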


Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.
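A minimal sketch of the annotation style follows. The `njit` decorator is part of Numba; the fallback definition is an addition here so that the example still runs, uncompiled, where Numba is not installed. The function itself is a hypothetical example of a math-heavy loop.

```python
try:
    from numba import njit  # Numba's decorator for nopython JIT compilation
except ImportError:
    # Fallback (an assumption of this sketch): run as plain Python instead.
    def njit(func):
        return func

@njit
def sum_of_squares(n):
    """Sum i*i for i in range(n) using an explicit loop."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total

print(sum_of_squares(10))
```

With Numba available, the first call triggers compilation for the argument types used, and later calls with the same types reuse the compiled code.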



A scientific computing library for the JVM. It is meant to be used in production environments rather than as a research tool, which means routines are designed to run fast with minimum RAM requirements. A usability gap has separated Java, Scala and Clojure programmers from the most powerful tools in data analysis, like NumPy or Matlab. With ND4J, intuitive scientific computing tools once limited to the Python community are now open source, distributed and integrated with GPUs on the JVM.


Oasis is an open source finite element Navier-Stokes solver written from scratch in Python using building blocks from FEniCS. The solver is unstructured, runs with MPI and interfaces, through FEniCS, to the state-of-the-art linear algebra backend PETSc. Oasis advocates a high-level, programmable Python user interface, where the user is placed in complete control of every aspect of the solver.

There are currently two solvers implemented, one for steady-state and one for transient flows. The transient solver uses the fractional step algorithm for any finite element discretization of the actual Navier-Stokes equations. The steady-state solver is coupled using a mixed space for velocity and pressure.


The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results using finite difference, spectral element and discontinuous Galerkin methods show OCCA delivers portable high performance in different architectures and platforms.

High-order finite-difference methods are commonly used in wave propagators for industrial subsurface imaging algorithms. Computational aspects of the reduced linear elastic vertical transversely isotropic propagator are considered. Thread parallel algorithms suitable for implementing this propagator on multi-core and many-core processing devices are introduced. Portability is addressed through the use of the OCCA runtime programming interface. Finally, performance results are shown for various architectures on a representative synthetic test case.


The Open Cloud Computing Interface comprises a set of open community-led specifications delivered through the Open Grid Forum. OCCI is a protocol and API for all kinds of management tasks. OCCI was originally initiated to create a remote management API for IaaS model based services, allowing for the development of interoperable tools for common tasks including deployment, autonomic scaling and monitoring. It has since evolved into a flexible API with a strong focus on integration, portability, interoperability and innovation while still offering a high degree of extensibility. The current release of the Open Cloud Computing Interface is suitable to serve many other models in addition to IaaS, including e.g. PaaS and SaaS.


OpenCL implementations are provided as ICDs (Installable Client Drivers). An OpenCL program can use several ICDs thanks to an ICD Loader such as the one provided by this project. This free ICD Loader can load any (free or non-free) ICD.

This package aims at creating an open source alternative to vendor-specific OpenCL ICD loaders. The main difficulty in creating such software is that the order of function pointers in a structure is not publicly available. This software maintains a YAML database of all known and guessed entries. This package also delivers a skeleton of bindings to incorporate inside an OpenCL implementation to give it ICD functionality.


The Open Community Runtime project is creating an application building framework that explores new methods of high-core-count programming. The initial focus is on HPC applications. Its goal is to create a tool that helps app developers improve the power efficiency, programmability, and reliability of their work while maintaining app performance.

OCR will help the app developer with the complex process of writing multi-core apps by masking the effort to manage event-driven tasks, events (which embody dataflow and code flow dependencies), memory data blocks (with semantic annotations for runtime use), machine description facilities, and more.

This is a large open source project distributed under the GPL-2.0+ open source license. OCR was originally unveiled at Supercomputing Conference 2012 (SC12) with a major new release (v0.8) introduced at Supercomputing 2013 (SC13). Community participation is encouraged, both for runtime enhancement as well as exploration of algorithm/application decomposition for new programming models.


GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to Matlab so that most programs are easily portable.


The octavemagic extension provides the ability to interact with Octave. It is provided by the oct2py package, which may be installed using pip or easy_install.


OData (Open Data Protocol) is an OASIS standard that defines the best practice for building and consuming RESTful APIs. OData helps you focus on your business logic while building RESTful APIs without having to worry about the approaches to define request and response headers, status codes, HTTP methods, URL conventions, media types, payload formats and query options etc. OData also guides you about tracking changes, defining functions/actions for reusable procedures and sending asynchronous/batch requests etc. Additionally, OData provides facility for extension to fulfil any custom needs of your RESTful APIs. OData RESTful APIs are easy to consume. The OData metadata, a machine-readable description of the data model of the APIs, enables the creation of powerful generic client proxies and tools. Some of them can help you interact with OData even without knowing anything about the protocol.
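As an illustration of the URL conventions and query options mentioned above, the following sketch builds a query URL with the standard library only. `$filter`, `$select`, `$orderby` and `$top` are OData system query options; the service root, the entity set name and the helper function are hypothetical.

```python
from urllib.parse import quote, urlencode

def odata_query_url(service_root, entity_set, **options):
    """Build an OData query URL; keyword names become $-prefixed options."""
    params = {f"${name}": value for name, value in options.items()}
    # Keep '$' and ',' readable; other reserved characters are percent-encoded.
    query = urlencode(params, safe="$,", quote_via=quote)
    return f"{service_root.rstrip('/')}/{entity_set}?{query}"

url = odata_query_url(
    "https://example.org/odata",  # hypothetical service root
    "Products",                   # hypothetical entity set
    filter="Price lt 20",
    select="Name,Price",
    orderby="Price desc",
    top=5,
)
print(url)
```

The same pattern extends to other options, but which ones a given service honors depends on that service.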


ODataPy is an open-source Python library that implements the Open Data Protocol (OData). It supports the OData protocol version 4.0. It is built on top of ODataCpp using language binding. It is under development and currently serves only parts of client and client side proxy generation (code gen) aspects of OData.


OData server with support for MySQL and for BLOBs managed by Leveled.


The OpenFabrics Enterprise Distribution (OFED™) is open-source software for RDMA and kernel bypass applications. OFED is used in business, research and scientific environments that require highly efficient networks, storage connectivity and parallel computing. The software provides high performance computing sites and enterprise data centers with flexibility and investment protection as computing evolves towards applications that require extreme speeds, massive scalability and utility-class reliability.

OFED includes kernel-level drivers, channel-oriented RDMA and send/receive operations, kernel bypasses of the operating system, both kernel and user-level application programming interface (API) and services for parallel message passing (MPI), sockets data exchange (e.g., RDS, SDP), NAS and SAN storage (e.g. iSER, NFS-RDMA, SRP) and file system/database systems.

The network and fabric technologies that provide RDMA performance with OFED include: legacy 10 Gigabit Ethernet, iWARP for Ethernet, RDMA over Converged Ethernet (RoCE), and 10/20/40 Gigabit InfiniBand.


OFF, an open source (free software) code for performing fluid dynamics simulations, is presented. The aim of OFF is to solve, numerically, the unsteady (and steady) compressible Navier–Stokes equations of fluid dynamics by means of finite volume techniques: the research background is mainly focused on high-order (WENO) schemes for multi-fluids, multi-phase flows over complex geometries. To this purpose a highly modular, object-oriented application program interface (API) has been developed. In particular, the concepts of data encapsulation and inheritance available within Fortran language (from standard 2003) have been stressed in order to represent each fluid dynamics “entity” (e.g. the conservative variables of a finite volume, its geometry, etc…) by a single object so that a large variety of computational libraries can be easily (and efficiently) developed upon these objects.


See also Leaflet and MapFish.


A reference implementation of the OGC Sensor Observation Service specification (version 2.0).


The CEDA OGC Web Services framework (COWS) is a Python software framework developed at the Centre for Environmental Data Archival for implementing Open Geospatial Consortium web service standards.


GeoJModelBuilder couples geoprocessing Web services, NASA World Wind and Sensor Web services to support geoprocessing modeling and environmental monitoring. The main goal of GeoJModelBuilder is to bring an easy-to-use tool to the geoscientific community.

The tool allows users to drag and drop various geospatial services to visually generate workflows and interact with the workflows in a virtual globe environment. It also allows users to audit trails of workflow executions, check the provenance of data products, and support scientific reproducibility.

The tool is written in Java for platform independence. It can be run on any operating system that supports Java, such as Windows or Unix/Linux.


A catalog application to manage spatially referenced resources. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. GeoNetwork has been developed to connect spatial information communities and their data using a modern architecture, which is at the same time powerful and low cost, based on the principles of Free and Open Source Software (FOSS) and International and Open Standards for services and protocols (a.o. from ISO/TC211 and OGC).

The software provides an easy to use web interface to search geospatial data across multiple catalogs, combine distributed map services in the embedded map viewer, publish geospatial data using the online metadata editing tools and optionally the embedded GeoServer map server.

You will find support for a range of standards. Metadata standards (ISO19115/ISO19119/ISO19110 following ISO19139, FGDC and Dublin Core), Catalog interfaces (OGC-CSW2.0.2 ISO profile client and server, OAI-PMH client and server, GeoRSS server, GEO OpenSearch server, WebDAV harvesting, GeoNetwork to GeoNetwork harvesting support) and Map Services interfaces (OGC-WMS, WFS, WCS, KML and others) through the embedded GeoServer map server.


GeoNode is a web-based application and platform for developing geospatial information systems (GIS) and for deploying spatial data infrastructures (SDI). Data management tools built into GeoNode allow for integrated creation of data, metadata, and map visualizations. Each dataset in the system can be shared publicly or restricted to allow access to only specific users. Social features like user profiles and commenting and rating systems allow for the development of communities around each platform to facilitate the use, management, and quality control of the data the GeoNode instance contains.


GeoPackage is the modern alternative to formats like SDTS and Shapefile. At its core, GeoPackage is simply a SQLite database schema. If you know SQLite, you are close to knowing GeoPackage. Install SpatiaLite, the premier spatial extension to SQLite, and you get all the performance of a spatial database along with the convenience of a file-based data set that can be emailed, shared on a USB drive or burned to a DVD.

GeoPackage was carefully designed this way to facilitate widespread adoption and use of a single simple file format by both commercial and open-source software applications — on enterprise production platforms as well as mobile hand-held devices. GeoPackage is a standard from the Open Geospatial Consortium. It was designed and prototyped following a multi-year, open process of requirements testing and public input. It is designed for extension. So if you need more than the core GeoPackage feature set, join OGC’s open process to standardize community-tested enhancements.
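Because a GeoPackage is a SQLite database, ordinary SQL tooling can inspect it. The sketch below uses Python's built-in `sqlite3` module with an in-memory database and a simplified subset of the `gpkg_contents` table. The real standard defines more columns, constraints and required tables, and the layer shown is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real GeoPackage would be a .gpkg file
# Simplified subset of gpkg_contents (an assumption of this sketch); the
# actual schema has additional columns and constraints.
conn.execute(
    """CREATE TABLE gpkg_contents (
           table_name TEXT NOT NULL PRIMARY KEY,
           data_type  TEXT NOT NULL,
           identifier TEXT UNIQUE,
           srs_id     INTEGER
       )"""
)
conn.execute(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?)",
    ("rivers", "features", "Rivers (hypothetical layer)", 4326),
)

# Listing the layers in a GeoPackage is an ordinary SQL query.
layers = conn.execute(
    "SELECT table_name, data_type, srs_id FROM gpkg_contents ORDER BY table_name"
).fetchall()
print(layers)
conn.close()
```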


GeoWebCache is a Java web application used to cache map tiles coming from a variety of sources such as OGC Web Map Service (WMS). It implements various service interfaces (such as WMS-C, WMTS, TMS, Google Maps KML, Virtual Earth) in order to accelerate and optimize map image delivery. It can also recombine tiles to work with regular WMS clients.


An IOOS customized build of the 52°North Sensor Observation Service (SOS). This extends the stock upstream 52°North (52n) SOS with IOOS specific encoding formats, test data, and more. The custom encoding formats include:

  • enhanced GetCapabilitiesResponse (extra metadata)

  • enhanced SensorML (extra metadata and network/station/sensor hierarchies)

  • O&M and SWE (IOOS m1.0 SOS format)

  • netCDF (CF 1.6/ACDD 1.1/NODC 1.0/IOOS 1.0 conventions)


An XML parser library for the Sensor Web Enablement (SWE) Common Data Model (CDM).


An OGC SOS server implementation written in Python. istSOS allows for managing and dispatching observations from monitoring sensors according to the Sensor Observation Service standard. The project also provides a graphical user interface that eases daily operations and a RESTful web API for automating administration procedures.


A compliant WMS server written in Python and using the Mapnik C++ library.


OWSLib is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models.


An OGC CSW server implementation written in Python that allows for the publishing and discovery of geospatial metadata, providing a standards-based metadata and catalogue component of spatial data infrastructures.


Python library for collecting Met/Ocean observations. Pyoos attempts to fill the need for a high-level data collection library for met/ocean data publicly available through many different websites and web services.

Pyoos collects and parses a number of data services into the Paegan Discrete Geometry CDM.


PySOS is a Python-based implementation of the OGC SOS standard. It is a lightweight set of scripts that work in conjunction with a web server to serve data from a relational database.


The OGC Styled Layer Descriptor (SLD) profile of the WMS standard defines encoding that extends the WMS standard to allow user-defined symbolization and coloring of geographic feature and coverage data. It addresses the need for users and software to be able to control the visual portrayal of the geospatial data. The ability to define styling rules requires a styling language that the client and server can both understand.

This is a Python library for reading, writing, and manipulating SLD files.
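To make the styling language concrete, the sketch below reads, modifies and re-serializes a minimal hand-written SLD 1.0 document using only Python's standard library. It does not use the API of the library described here; the layer name and colors are made up.

```python
import xml.etree.ElementTree as ET

SLD_NS = {"sld": "http://www.opengis.net/sld"}

# Hand-written minimal SLD 1.0 document (hypothetical layer and color).
SLD_TEXT = """<StyledLayerDescriptor version="1.0.0"
    xmlns="http://www.opengis.net/sld">
  <NamedLayer>
    <Name>lakes</Name>
    <UserStyle>
      <FeatureTypeStyle>
        <Rule>
          <PolygonSymbolizer>
            <Fill>
              <CssParameter name="fill">#3366cc</CssParameter>
            </Fill>
          </PolygonSymbolizer>
        </Rule>
      </FeatureTypeStyle>
    </UserStyle>
  </NamedLayer>
</StyledLayerDescriptor>"""

root = ET.fromstring(SLD_TEXT)
layer_name = root.find("sld:NamedLayer/sld:Name", SLD_NS).text
fill = root.find(".//sld:Fill/sld:CssParameter[@name='fill']", SLD_NS)
original_fill = fill.text
print(layer_name, original_fill)

# Change the fill color and serialize the document again.
fill.text = "#ff0000"
updated = ET.tostring(root, encoding="unicode")
```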


An implementation of the Web Processing Service standard from the Open Geospatial Consortium. PyWPS offers an environment for programming your own processes (geofunctions or models) that can be accessed by the public. The main advantage of PyWPS is that it has been written with native support for GRASS GIS. Access to GRASS modules via a web interface should be as easy as possible.


A client for Sensor Observation Services (SOS) as specified by the Open Geospatial Consortium (OGC). It allows users to retrieve metadata from SOS web services and to interactively create requests for near real-time observation data based on the available sensors, phenomena, observations et cetera using thematic, temporal and spatial filtering.


SOS.js is a JavaScript library to browse, visualise, and access data from an Open Geospatial Consortium (OGC) Sensor Observation Service (SOS). The library consists of a number of modules, which along with their dependencies build a layered abstraction for communicating with a SOS.

The core module - SOS.js, contains a number of objects that encapsulate core concepts of SOS, such as managing the service connection parameters, the service’s capabilities document, methods to access the service’s Features of Interest (FOIs), offerings, observed properties etc. It also contains various utility functions, available as methods of the SOS.Utils object. This module is built on top of OpenLayers, for low-level SOS request/response handling.

The user interface module - SOS.Ui.js, contains the UI components of the library. These components can be used standalone, but are also brought together in the default SOS.App object as a (somewhat) generic web application. This module is built on top of OpenLayers which provides simple mapping for discovery; jQuery for the UI and plumbing; and flot, which is a jQuery plugin, for the plotting.


Sensor Observation Service (SOS) and data management.


A library extending the basic SQLite core in order to get a full fledged Spatial DBMS, really simple and lightweight, but mostly OGC-SFS compliant.


A collection of open source Command Line Interface (CLI) tools supporting SpatiaLite.


The web-based Earth Observation Monitor (webEOM) provides easy access and visualization for spatial time-series data. It is based on a spatial data infrastructure containing a metadata catalogue, visualization and download services as well as processing services. These services are compliant with specifications of the Open Geospatial Consortium (OGC).

webEOM is designed for ease of use. Time-series plots can be generated within a few clicks, with no data processing required from the user. Further developments are planned for 2014; e.g., users will be able to generate further plots specified by individual parameters and to specify monitoring parameters for individual areas and datasets.


Data download from multiple data providers, as well as data integration and provision with OGC-compliant web services, are implemented in the Python programming language.


The OmicABEL (pronounced as "amicable") package allows rapid mixed-model based genome-wide association analysis; it efficiently handles large datasets, and both single trait and multiple trait ("omics") analysis.

CLAK-GWAS is software for performing Genome-Wide Association Studies (GWAS). It provides a high-performance implementation of two algorithms, CLAK-Chol and CLAK-Eig, for GWAS involving single and multiple phenotypes, respectively.


Omni Compiler is a collection of programs and libraries that allow users to build code transformation compilers. Omni Compiler translates C and Fortran programs with XcalableMP and/or OpenACC directives into parallel code suitable for compiling with a native compiler linked with the Omni Compiler runtime library.

Omni Compiler consists of the following components:

  • XcalableMP - XcalableMP is a directive-based language extension of C and Fortran for distributed memory systems. XcalableMP allows users to develop a parallel application and to tune its performance with minimal and simple notation.

  • OpenACC - OpenACC is a directive-based programming interface for accelerators such as GPGPU. OpenACC allows users to express the offloading of data and computations to accelerators to simplify the porting process for legacy CPU-based applications.

  • XcodeML - XcodeML is an intermediate code written in XML for C and Fortran languages.


OMP2HMPP is a tool that automatically translates high-level C source code into HMPP. The generated version will rarely differ from a hand-coded HMPP version and will provide a substantial speedup, near 113%, that could later be improved with hand-coded CUDA.


OMP2MPI automatically generates MPI source code from OpenMP, allowing the program to exploit non-shared-memory architectures such as clusters or Network-on-Chip-based (NoC-based) Multiprocessor Systems-on-Chip (MPSoC). OMP2MPI gives a solution that allows further optimization by an expert who wants to achieve better results. In most cases, the tested set of problems obtains more than a 20x speedup on 64 cores compared to the sequential version, and an average speedup of over 4x compared to OpenMP.


OmpSs is an effort to integrate features from the StarSs programming model developed by BSC into a single programming model. In particular, our objective is to extend OpenMP with new directives to support asynchronous parallelism and heterogeneity (devices like GPUs). However, it can also be understood as new directives extending other accelerator based APIs like CUDA or OpenCL. Our OmpSs environment is built on top of our Mercurium compiler and Nanos++ runtime system.

Asynchronous parallelism is enabled in OmpSs by the use of data dependencies between the different tasks of the program. To support heterogeneity, a new construct is introduced: the target construct.
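The idea of ordering tasks by their data dependencies can be sketched as follows. This is not OmpSs syntax (OmpSs uses compiler directives) and not its runtime; it is a hypothetical scheduler in which each task declares the names it reads and the name it writes, and a task starts as soon as its inputs exist. Only read-after-write dependencies are modeled.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def run_tasks(tasks, data):
    """Run tasks as soon as the data they read has been produced.

    Each task is (function, input_names, output_name). Ordering is derived
    only from these declarations, not from the order of the task list.
    """
    pending = list(tasks)
    running = {}  # future -> name of the value it will produce
    with ThreadPoolExecutor() as pool:
        while pending or running:
            # Launch every task whose inputs are all available.
            ready = [t for t in pending if all(n in data for n in t[1])]
            for task in ready:
                pending.remove(task)
                func, inputs, output = task
                future = pool.submit(func, *(data[n] for n in inputs))
                running[future] = output
            if not running:
                raise ValueError("unsatisfiable dependencies")
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                data[running.pop(future)] = future.result()
    return data

tasks = [
    (lambda a, b: a + b, ("x2", "y2"), "total"),  # needs both squares
    (lambda x: x * x, ("x",), "x2"),
    (lambda y: y * y, ("y",), "y2"),
]
result = run_tasks(tasks, {"x": 3, "y": 4})
print(result["total"])
```

The two squaring tasks are independent and may run concurrently; the sum waits for both even though it is listed first.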


DLB is a dynamic library designed to speed up hybrid applications by improving their load balance. DLB will redistribute the computational resources of the second level of parallelism to improve the load balance of the outer level of parallelism.


A runtime designed to serve as runtime support in parallel environments. It is mainly used to support OmpSs, an extension to OpenMP developed at BSC. It also has modules to support OpenMP and Chapel. Nanospp provides services to support task parallelism using synchronizations based on data dependencies. Data parallelism is also supported by means of services mapped on top of its task support. Tasks are implemented as user-level threads when possible. It also provides support for maintaining coherence across different address spaces (such as with GPUs or cluster nodes), and provides software directory and cache modules to this end.


Mercurium is a source-to-source compilation infrastructure aimed at fast prototyping. Currently supported languages are C and C++. Mercurium is mainly used in the Nanos environment to implement OpenMP, but since it is quite extensible it has been used to implement other programming models or compiler transformations. Examples include Cell Superscalar, Software Transactional Memory, Distributed Shared Memory and the ACOTES project, to name a few.

Extending Mercurium is achieved using a plugin architecture, where plugins represent several phases of the compiler. These plugins are written in C++ and dynamically loaded by the compiler according to the chosen configuration. Code transformations are implemented in terms of source code (there is no need to modify or know the internal syntactic representation of the compiler).


OP2 is an open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs. It is an API with associated libraries and preprocessors to generate parallel executables for applications on unstructured grids.

Although OP2 is designed to look like a conventional library, the implementation uses source-to-source translation to generate the appropriate back-end code for the different target platforms.


Framework for performance-portable parallel computations on unstructured meshes.


The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator. OpenACC is designed for portability across operating systems, host CPUs, and a wide range of accelerators, including APUs, GPUs, and many-core coprocessors. The directives and programming model defined in the OpenACC API document allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.

OpenACC in GCC

This page contains information on GCC’s implementation of the OpenACC specification and related functionality.

KernelGen (LLVM)

A prototype of auto-parallelizing Fortran/C compiler for NVIDIA GPUs, targeting numerical modelling code.


OpenAlea is an open source project primarily aimed at the plant research community. It is a distributed collaborative effort to develop Python libraries and tools that address the needs of current and future works in Plant Architecture modeling. OpenAlea includes modules to analyse, visualize and model the functioning and growth of plant architecture.


OpenARC is a new, open source compiler framework that provides an extensible environment in which various performance optimizations, traceability mechanisms, fault tolerance techniques, etc., can be built for better debuggability, performance and resilience in complex accelerator computing.


Open CASCADE Technology is a software development kit (SDK) intended for development of applications dealing with 3D CAD data, freely available in open source. It includes a set of C++ class libraries providing services for 3D surface and solid modeling, visualization, data exchange and rapid application development.


OpenCL™ is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. OpenCL (Open Computing Language) greatly improves speed and responsiveness for a wide spectrum of applications in numerous market categories from gaming and entertainment to scientific and medical software.


The easiest way to get started with OpenCL.

Intel OpenCL

Intel® Code Builder for OpenCL™ API is a comprehensive environment for OpenCL software development on Intel Architecture processors and Intel Xeon Phi™ coprocessors. The Code Builder comprises the Intel implementation of the OpenCL standard and a set of tools for OpenCL application development on Linux operating systems.


Python cffi OpenCL bindings and helper classes.


Portable Computing Language (pocl) aims to become an MIT-licensed open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPUs and heterogeneous GPUs/accelerators.

pocl uses Clang as an OpenCL C frontend and LLVM for the kernel compiler implementation, and as a portability layer. Thus, if your desired target has an LLVM backend, it should be able to get OpenCL support easily by using pocl.

The goal is to accomplish improved performance portability using a kernel compiler that can generate multi-work-item work-group functions that exploit various types of parallel hardware resources: VLIW, superscalar, SIMD, SIMT, multicore, multithread, and so on.

An additional purpose of the project is to serve as a research platform for issues in parallel programming on heterogeneous platforms.


OpenClimateGIS (OCGIS) is a Python package designed for geospatial manipulation, subsetting, computation, and translation of climate datasets stored in local NetCDF files or files served through THREDDS data servers. OpenClimateGIS has a straightforward, request-based API that is simple to use yet complex enough to perform a variety of computational tasks. The software is built entirely from open source packages. ClimateTranslator is a new web interface to the OpenClimateGIS functionality.


OpenCMISS libraries and applications provide the foundation for developing computational modelling and visualisation software, particularly targeting bioengineering.


OpenCoarrays is an open-source software project that supports the coarray Fortran (CAF) parallel programming features of the Fortran 2008 standard and several features proposed for Fortran 2015 in the draft Technical Specification TS 18508 Additional Parallel Features in Fortran.

OpenCoarrays provides a compiler wrapper (named "caf"), a runtime library (named "libcaf_mpi.a" by default), and an executable file launcher (named "cafrun"). With OpenCoarrays-aware compilers, the compiler wrapper passes the provided source code to the chosen compiler ("mpif90" by default). For non-OpenCoarrays-aware compilers, the wrapper transforms CAF syntax into OpenCoarrays procedure calls before invoking the chosen compiler on the transformed code. The runtime library supports compiler communication and synchronization requests by invoking a lower-level communication library, the Message Passing Interface (MPI) by default. The launcher passes execution to the chosen communication library’s parallel program launcher ("mpirun" by default).

OpenCoarrays defines an application binary interface (ABI) that translates high-level communication and synchronization requests into low-level calls to a user-specified communication library. This design decision liberates compiler teams from hardwiring communication-library choice into their compilers and it frees Fortran programmers to express parallel algorithms once and reuse identical CAF source with whichever communication library is most efficient for a given hardware platform.


OpenCores is an open source hardware community developing digital open source hardware through electronic design automation, with a similar ethos to the free software movement. OpenCores hopes to eliminate redundant design work and slash development costs.


OPeNDAP stands for "Open-source Project for a Network Data Access Protocol". OPeNDAP is both the name of a non-profit organization and the commonly-used name of a protocol which the OPeNDAP organization has developed. The DAP2 protocol provides a discipline-neutral means of requesting and providing data across the World Wide Web. The goal is to allow end users, whoever they may be, to access immediately whatever data they require in a form they can use, all while using applications they already possess and are familiar with. In the field of oceanography, OPeNDAP has already helped the research community make significant progress towards this end. Ultimately, it is hoped, OPeNDAP will be a fundamental component of systems which provide machine-to-machine interoperability with semantic meaning in a highly distributed environment of heterogeneous datasets. The OPeNDAP organization exists to develop, implement, and promulgate the OPeNDAP protocol. It presents the results of its work freely to the public with the hope that it will be of service in many disciplines and facilitate sharing of and access to their data streams.
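A DAP2 request is typically an ordinary URL: the dataset address, a response suffix such as `.dds` or `.ascii`, and an optional constraint expression that selects variables and index ranges. The sketch below only assembles such a URL as a string. The helper function, server address and variable name are hypothetical, and no request is sent.

```python
def dap2_url(dataset_url, response="ascii", projections=()):
    """Build a DAP2 request URL of the form <dataset>.<response>?<projection>."""
    parts = []
    for name, slices in projections:
        # Each hyperslab is written [start:stride:stop], one per dimension.
        hyperslab = "".join(
            f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
        )
        parts.append(name + hyperslab)
    constraint = ",".join(parts)
    url = f"{dataset_url}.{response}"
    return f"{url}?{constraint}" if constraint else url

url = dap2_url(
    "http://example.org/opendap/sst.nc",              # hypothetical dataset
    projections=[("sst", [(0, 1, 0), (10, 2, 20)])],  # hypothetical variable
)
print(url)
```

Because the subset is expressed in the URL, the client asks only for the slice it needs instead of downloading the whole file.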


Pydap is a pure Python library implementing the Data Access Protocol, also known as DODS or OPeNDAP. You can use Pydap as a client to access hundreds of scientific datasets in a transparent and efficient way through the internet; or as a server to easily distribute your data from a variety of formats.

Pydap includes several handlers, i.e. special Python modules that convert between a given data format and the data model used by Pydap (defined in the pydap.model module). They are necessary for Pydap to be able to actually serve a dataset. There are handlers for NetCDF, HDF 4 & 5, Matlab, relational databases, Grib 1 & 2, CSV, Seabird CTD files, and a few more.


The goal of the OpenDSA project is to create open-source courseware for use in Data Structures and Algorithms courses that deeply integrates textbook-quality content with algorithm visualizations and interactive, automatically assessed exercises.


OpenFL is a free and open source software framework and platform for the creation of multi-platform applications and video games. OpenFL programs are written in a single language (Haxe) and may be published to Flash movies, or standalone applications for Microsoft Windows, Mac OS X, Linux, iOS, Android, BlackBerry OS, Firefox OS, HTML5 and Tizen.

OpenFL is designed to mimic Adobe Flash Player, and provides much of the same functionality and API. SWF files created with Adobe Flash Professional or other authoring tools may be used in OpenFL programs.


An open source C++ toolkit designed to assist the creative process by providing a simple and intuitive framework for experimentation. The openFrameworks package is designed to work as a general purpose glue, and wraps together several commonly used libraries, including:

  • OpenGL, GLEW, GLUT, libtess2 and cairo for graphics;

  • rtAudio, PortAudio, OpenAL and Kiss FFT or FMOD for audio input, output and analysis;

  • FreeType for fonts;

  • FreeImage for image saving and loading;

  • Quicktime, GStreamer and videoInput for video playback and grabbing;

  • Poco for a variety of utilities;

  • OpenCV for computer vision; and

  • Assimp for 3D model loading.

The code is written to be massively cross-compatible. Right now we support five operating systems (Windows, OS X, Linux, iOS, Android) and four IDEs (Xcode, Code::Blocks, Visual Studio, and Eclipse). The API is designed to be minimal and easy to grasp.


OpenMP (Open Multi-Processing) is an API that supports multi-platform shared memory multiprocessing programming in C, C++, and Fortran, on most processor architectures and operating systems. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.

OpenMP is an implementation of multithreading, a method of parallelizing whereby a master thread (a series of instructions executed consecutively) forks a specified number of slave threads and the system divides a task among them. The threads then run concurrently, with the runtime environment allocating threads to different processors.
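
The fork-join pattern described above can be sketched in Python. This is only an analogy: it uses the standard library's concurrent.futures.ThreadPoolExecutor rather than OpenMP itself, and the names parallel_sum and num_threads are made up for illustration. In C, the same structure would typically be written as a loop annotated with an OpenMP parallel for directive.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, num_threads=4):
    """Fork-join sum: split the work, run the chunks concurrently, combine."""
    # Divide the task into roughly equal chunks, one per worker thread.
    chunk = max(1, (len(values) + num_threads - 1) // num_threads)
    pieces = [values[i:i + chunk] for i in range(0, len(values), chunk)]

    # "Fork": the main thread hands each chunk to a worker thread.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partial_sums = list(pool.map(sum, pieces))

    # "Join": all workers have finished; the main thread combines the results.
    return sum(partial_sums)


total = parallel_sum(list(range(1, 101)))
print(total)  # 5050
```

In CPython, threads generally do not speed up CPU-bound work like this, so the sketch shows the structure of fork-join rather than its performance benefit.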

IWOMP (International Workshop on OpenMP) -

FORTRAN90 Examples of Parallel Programming with OpenMP -

See also OmpSs.


ForestGOMP is an OpenMP runtime compatible with GCC 4.2, offering a structured way to efficiently execute OpenMP applications onto hierarchical (NUMA) architectures.


OpenNL (Open Numerical Library) is a library for solving sparse linear systems, especially designed for the Computer Graphics community. The goal for OpenNL is to be as small as possible, while offering the subset of functionalities required by this application field. The Makefiles of OpenNL can generate a single .c + .h file, very easy to integrate in other projects. The distribution includes an implementation of our Least Squares Conformal Maps parameterization method. It includes support for CUDA and Fermi architecture (Concurrent Number Cruncher and Nathan Bell’s ELL formats).


OpenSim is a freely available, user extensible software system that lets users develop models of musculoskeletal structures and create dynamic simulations of movement.

OpenSim 3.2 includes an improved scripting interface, accessible through the Graphical User Interface (GUI), Matlab, and now Python. We also added new visualization capabilities and usability improvements in the OpenSim application.


OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface.

OpenStack is a free and open-source cloud computing software platform. Users primarily deploy it as an infrastructure as a service (IaaS) solution. The technology consists of a series of interrelated projects that control pools of processing, storage, and networking resources throughout a data center, which users manage through a web-based dashboard, command-line tools, or a RESTful API. It is released under the terms of the Apache License.


Neutron is an OpenStack project to provide "networking as a service" between interface devices (e.g., vNICs) managed by other OpenStack services (e.g., nova).


OpenUH is an open source, optimizing compiler suite for C, C++ and Fortran, based on Open64. It supports a variety of architectures including x86-64, IA-32, IA-64, MIPS, and PTX.

OpenUH extends the Open64 OpenMP implementation by adding support for nested parallelism and the tasking features introduced in OpenMP 3.0. The OpenMP runtime library that comes with OpenUH supports several task scheduling strategies, enables selection of more scalable barrier algorithms, and provides an implementation of the OpenMP Collector API for interaction with performance collection tools (including DARWIN). The OpenMP implementation has been successfully tested using a number of applications and validated with the NAS Parallel Benchmarks (NPB) and our OpenMP Validation Suite, developed in collaboration with the High Performance Computing Center Stuttgart (HLRS) from the University of Stuttgart. OpenUH also provides support for Fortran coarrays, an extension that has been adopted in the Fortran 2008 standard. With the use of coarrays, a programmer can easily write parallel Fortran programs for a variety of parallel systems. The OpenUH CAF implementation can work in conjunction with either the GASNet or ARMCI runtime libraries, open-source projects which are freely downloadable online.

To achieve portability, OpenUH is able to emit optimized C or Fortran 77 code that may be compiled by a native compiler on other platforms. The supporting runtime libraries are also portable - the OpenMP runtime library is based on the portable Pthreads interface while the Coarray Fortran runtime library is based on the portable GASNet (or, optionally, ARMCI) communications interfaces.


The openZIM project proposes offline storage solutions for content coming from the Web. The project has two different targets:

  • Definition of the ZIM file format: an open and standardized file format,

  • Implementation of the zimlib: an open source (GPLv2) implementation of the ZIM file format.

See also Kiwix and Internet-in-a-Box.

operating systems


A new research operating system being built from scratch. We are exploring how to structure an OS for future multi- and many-core systems. We are motivated by two closely related trends in hardware design: first, the rapidly growing number of cores, which leads to a scalability challenge, and second, the increasing diversity in computer hardware, requiring the OS to manage and exploit heterogeneous hardware resources.

Barrelfish is a “multikernel” operating system: it consists of a small kernel running on each core (one kernel per core), while the rest of the OS is structured as a distributed system of single-core processes atop these kernels. Kernels share no memory, even on a machine with cache-coherent shared RAM, and the rest of the OS does not use shared memory except for transferring messages and data between cores, and booting other cores. Applications can use multiple cores and share address spaces (and therefore cache-coherent shared memory) between cores, but this facility is provided by user-space runtime libraries.


Harvey is an effort to get the Plan 9 code working with gcc and clang. Our aim is to provide a modern, distributed, 64 bit operating system that does away with Unix’s wrinkles and allows for new ways of working.


A portable hybrid distributed OS based on Inferno, LuaJIT and Libuv. Node9 is a hosted 64-bit operating system based on Bell Labs’ Inferno OS, but using the Lua scripting language instead of Limbo and the LuaJIT high performance virtual machine instead of the DIS virtual machine. It also uses the libuv I/O library for maximum portability, efficient event processing and thread management.

Node9 embraces a highly interactive programming environment optimized for the needs of distributed computing based on the Plan9/Inferno 9p resource sharing protocol, per-process namespace security and application message channels.


A 20 MB Linux distro that runs the entire OS as Docker containers.


PerspicuOS is a prototype operating system that realizes the Nested Kernel, a new operating system architecture that restricts access to a device’s memory management unit so that it can then perform memory isolation within the kernel. The key challenge that PerspicuOS addresses is how to virtualize the MMU on real hardware, AMD64, in a real operating system, FreeBSD 9.0, while not assuming any hardware privilege separation or kernel integrity properties such as control flow integrity.


A distributed operating system, originally developed at Bell Labs, but now developed and maintained by Vita Nuova® as Free Software. Applications written in Inferno’s concurrent programming language, Limbo, are compiled to its portable virtual machine code (Dis), to run anywhere on a network in the portable environment that Inferno provides. Unusually, that environment looks and acts like a complete operating system.

The use of a high-level language and virtual machine is sensible but mundane. The interesting thing is the system’s representation of services and resources. They are represented in a file-like name hierarchy. Programs access them using only the file operations open, read/write, and close. The files may of course represent stored data, but may also be devices, network and protocol interfaces, dynamic data sources, and services. The approach unifies and provides basic naming, structuring, and access control mechanisms for all system resources. A single file-service protocol (the same as Plan 9’s 9P) makes all those resources available for import or export throughout the network in a uniform way, independent of location. An application simply attaches the resources it needs to its own per-process name hierarchy (name space).

The system can be used to build portable client and server applications. It makes it straightforward to build lean applications that share all manner of resources over a network, without the cruft of much of the Grid software one sees.


Sortix is a small self-hosting Unix-like operating system developed since 2011 aiming to be a clean and modern POSIX implementation. There’s a lot of technical debt that needs to be paid, but it’s getting better. Traditional design mistakes are avoided or aggressively deprecated by updating the base system and ports as needed. The Sortix kernel, standard libraries, and most utilities were written entirely from scratch. The system is halfway through becoming multi-user and while security vulnerabilities are recognized as bugs, it should be considered insecure at this time.


TempleOS is somewhat of a legend in the operating system community. Its sole author, Terry A. Davis, has spent the past 12 years attempting to create a new operating system from scratch.


Orange File System is a branch of the Parallel Virtual File System. Like PVFS, Orange is a parallel file system designed for use on high end computing (HEC) systems that provides very high performance access to disk storage for parallel applications. OrangeFS is different from PVFS in that we have developed features for OrangeFS that are not presently available in the PVFS main distribution. While PVFS development tends to focus on specific very large systems, Orange considers a number of areas that have not been well supported by PVFS in the past.

OrangeFS is presently integrated with ROMIO through MPICH2, and includes FUSE support. It has also been integrated with pNFS.


Orcc is an open-source Integrated Development Environment based on Eclipse and dedicated to dataflow programming. The primary purpose of Orcc is to provide developers with a compiler infrastructure to allow software/hardware code to be generated from dataflow descriptions. Orcc does not generate assembly or executable code directly, rather it generates source code that must be compiled by another tool.

Orcc also provides a complete Java-based simulator which allows developers to quickly test their applications without taking into consideration low-level details of the target platform. The simulator can be launched directly from Eclipse to execute any RVC-CAL application. The simulator simply interprets our intermediate representation of networks and actors, but it is nevertheless able to perform all basic interactions required for functional validation, such as displaying text, images or videos on the screen.


Orc-apps is a library of open-source applications described in a dynamic dataflow programming way, using the RVC-CAL and FNL languages. The applications are fully compliant with the Orcc toolset.


ORCM was originally developed as an open-source project (under the Open MPI license) by Cisco Systems, Inc to provide a resilient, 100% uptime run-time environment for enterprise-class routers. Based on the Open Run-Time Environment (ORTE) embedded in Open MPI, the system provided launch and execution support for processes executing within the router itself (e.g., computing routing tables), ensuring that a minimum number of copies of each program were always present. Failed processes were relocated based on the concept of fault groups - i.e., the grouping of nodes with common failure modes. Thus, ORCM attempted to avoid cascade failures by ensuring that processes were not relocated onto nodes with a high probability of failing in the immediate future.

The Cisco implementation naturally required a significant amount of monitoring, and included the notion of fault prediction as a means of taking pre-emptive action to relocate processes prior to their node failing. This was facilitated using an analytics framework that allowed users to chain various analysis modules in the data pipeline so as to perform in-flight data reduction.

Subsequently, ORCM was extended by Greenplum to serve as a scalable monitoring system for Hadoop clusters. While ORCM itself had run on quite a few "nodes" in the Cisco router, and its base ORTE platform has been used for years on very large clusters involving many thousands of nodes, this was the first time the ORCM/ORTE platform had been used solely as a system state-of-health monitor with no responsibility for process launch or monitoring. Instead, ORCM was asked to provide a resilient, scalable monitoring capability that tracked process resource utilization and node state-of-health, collecting all the data in a database for subsequent analysis. Sampling rates were low enough that in-flight data reduction was not required, nor was fault prediction considered to be of value in the Hadoop paradigm.


Ori is a distributed file system built for offline operation that empowers the user with control over synchronization operations and conflict resolution. We provide history through lightweight snapshots and allow users to verify that the history has not been tampered with. Through the use of replication, instances can be resilient and recover damaged data from other nodes.


An open-source extensible framework for the definition of domain-specific languages and generation of optimized (C, Fortran, CUDA, OpenCL) code for multiple architecture targets (e.g., CPUs, NVIDIA and AMD GPUs, Intel Phi), including support for empirical autotuning of the generated code.

Orio is a Python framework for transforming and automatically tuning the performance of codes written in different source and target languages, including transformations from a number of simple languages (e.g., a restricted subset of C) to C, Fortran, CUDA, and OpenCL targets. The tool generates many tuned versions of the same operation using different optimization parameters, and performs an empirical search to select the best among the optimized code variants.
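
The idea of empirical autotuning can be illustrated with a toy stand-in. The sketch below is not Orio's API or annotation syntax: make_blocked_sum, autotune, and the candidate block sizes are hypothetical names chosen for illustration, and the search is a plain exhaustive loop over variants timed with the standard library's timeit module.

```python
import timeit


def make_blocked_sum(block):
    """Return a variant of 'sum a list' that processes `block` items per step."""
    def blocked_sum(data):
        total = 0
        for start in range(0, len(data), block):
            total += sum(data[start:start + block])
        return total
    return blocked_sum


def autotune(data, candidate_blocks, repeats=3):
    """Time every variant on the same input and return the fastest block size."""
    timings = {}
    for block in candidate_blocks:
        variant = make_blocked_sum(block)
        # Keep the best of several runs to reduce timing noise.
        runs = timeit.repeat(lambda: variant(data), number=5, repeat=repeats)
        timings[block] = min(runs)
    best = min(timings, key=timings.get)
    return best, timings


data = list(range(10_000))
best_block, timings = autotune(data, [8, 64, 512, 4096])
print("fastest block size on this machine:", best_block)
```

Every variant computes the same result; only the measured running time differs, which is why the winner can change from one machine to another.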


ORSA is an interactive tool for scientific grade Celestial Mechanics computations. Asteroids, comets, artificial satellites, Solar and extra-Solar planetary systems can be accurately reproduced, simulated, and analyzed. One of the main goals is to create a common infrastructure among the existing celestial mechanics programs and standards. The features include:

  • accurate numerical algorithms

  • use of JPL ephemeris files for accurate planet positions

  • Qt-based graphical user interface

  • advanced 2D plotting tool and 3D OpenGL viewer

  • import asteroids and comets from all the known databases (MPC, JPL, Lowell, AstDyS, and NEODyS)

  • integrated download tool to update databases

  • stand alone numerical library liborsa


RendezvousWithVesta is a graphical software tool developed by Pasquale Tricarico at the Planetary Science Institute, in support of the NASA DAWN mission. It accurately simulates the dynamics of a spacecraft orbiting the asteroid (4) Vesta. The motivations for developing this tool are to (1) understand how the physical parameters of Vesta affect the stability of low polar orbits; (2) understand how the physical parameters of Vesta and the orbital elements of DAWN affect the coverage of Vesta’s surface; and (3) provide a fast and reliable tool for the generation of orbits suitable for input in the Science Opportunity Analyzer (SOA) tool.

The features include:

  • validated numerical algorithms, tested on NEAR mission data, and capable of accurately reproducing NEAR’s orbit around Eros;

  • complete control over Vesta’s physical properties: mass, mass distribution model, shape model, rotation period, and pole ecliptic latitude and longitude;

  • control over DAWN’s initial orbit around Vesta: epoch, radius, equatorial (Vesta’s equator) inclination, phase angle;

  • export simulations as SPICE kernel files and as ASCII data files;

  • 3D graphical visualization of the numerical simulation, including the ground tracking of DAWN over Vesta’s surface;

  • 2D plot of the altitude of the spacecraft and of the Vesta profile at nadir; and

  • completely open source and part of the ORSA framework.


SurfaceCoverage is a graphical software tool developed by Pasquale Tricarico at the Planetary Science Institute, in support of the NASA Dawn mission. It estimates the coverage of the surface of asteroid (4) Vesta by the Dawn spacecraft in a wide range of configurations.


The Optimized Sparse Kernel Interface (OSKI) Library is a collection of low-level C primitives that provide automatically tuned computational kernels on sparse matrices, for use in solver libraries and applications. OSKI has a BLAS-style interface, providing basic kernels like sparse matrix-vector multiply and sparse triangular solve, among others. The current implementation targets cache-based superscalar uniprocessor machines, though we are developing extensions for vector architectures, SMPs, and large-scale distributed memory machines.
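
For readers unfamiliar with the kernels mentioned above, here is a minimal sketch of sparse matrix-vector multiply. It is written in plain Python rather than C, it is not OSKI's interface, and it assumes the common compressed sparse row (CSR) layout; the name csr_matvec and the example matrix are chosen for illustration only.

```python
def csr_matvec(values, col_index, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in compressed sparse row form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        # The nonzeros of this row are values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_index[k]]
    return y


# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]   # nonzero entries, row by row
col_index = [0, 2, 1, 0, 2]          # column of each nonzero
row_ptr = [0, 2, 3, 5]               # where each row starts in `values`

y = csr_matvec(values, col_index, row_ptr, [1.0, 2.0, 3.0])
print(y)  # [7.0, 6.0, 17.0]
```

Only the stored nonzeros are touched, so the cost scales with the number of nonzeros rather than with the full matrix size. How those nonzeros are laid out in memory strongly affects cache behavior, which is the kind of choice an automatically tuned library aims to make for the user.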


OSTree is a tool for managing bootable, immutable, versioned filesystem trees. It is not a package system; nor is it a tool for managing full disk images. Instead, it sits between those levels, offering a blend of the advantages (and disadvantages) of both.

You can use any build system you like to place content into it on a build server, then export an OSTree repository via static HTTP. On each client system, "ostree admin upgrade" can incrementally replicate that content, creating a new root for the next reboot. This provides fully atomic upgrades. Any changes made to /etc are propagated forwards, and all local state in /var is shared.

A key goal of the project is to complement existing package systems like RPM and Debian packages, and help further their evolution. In particular for example, RPM-OSTree (linked below) has as a goal a hybrid tree/package model, where you replicate a base tree via OSTree, and then add packages on top.


An open source document mining platform. Read and analyze thousands of documents super quickly. Full text search, topic modeling, coding and tagging, visualizations and more. All in an easy-to-use, visual workflow.


ownCloud provides access to your data through a web interface or WebDAV while providing a platform to view, sync and share across devices easily—all under your control. ownCloud’s open architecture is extensible via a simple but powerful API for applications and plugins and works with any storage.


A Python package for arithmetical computations on random variables. The package is capable of performing the four arithmetic operations: addition, subtraction, multiplication and division, as well as computing many standard functions of random variables. Summary statistics, random number generation, plots, and histograms of the resulting distributions can easily be obtained and distribution parameter fitting is also available. The operations are performed numerically and their results interpolated allowing for arbitrary arithmetic operations on random variables following practically any probability distribution encountered in practice. The package is easy to use, as operations on random variables are performed just as they are on standard Python variables. Independence of random variables is, by default, assumed on each step but some computations on dependent random variables are also possible. We demonstrate on several examples that the results are very accurate, often close to machine precision. Practical applications include statistics, physical measurements or estimation of error distributions in scientific computations.


PadicoTM is the runtime infrastructure for the Padico software environment for computational grids. It is composed of a core which provides a high-performance framework for networking and multi-threading, and services plugged into the core. High-performance communications and threads are obtained thanks to Marcel and Madeleine, provided by the PM2 software suite. The PadicoTM core aims at making the different services running at the same time run in a cooperative way rather than competitive.

PadicoTM exhibits standard interface (VIO: virtual sockets; Circuit: Madeleine-like API; etc.) usable by various middleware systems. Thanks to symbol interception by PadicoTM, middleware is unmodified and utilizes PadicoTM communication methods seamlessly. The middleware systems available over PadicoTM are:

  • CORBA implementations: omniORB and Mico

  • the MPI implementations NewMadeleine and GridMPI

  • a Java Virtual Machine based on Kaffe

  • the gSOAP SOAP/Web services development toolkit

  • an implementation of the JXTA P2P specifications called JXTA-C


A Python Common Data Model (CDM) for met/ocean data. Paegan attempts to fill the need for a high level common data model (CDM) library for array based met/ocean data stored in netCDF files or distributed over OPeNDAP.


An open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other 3rd party libraries.


GeoCoon is a GIS data analysis Python library, which integrates Pandas data frames with Shapely GIS geometries. The library provides means to load GIS data in the form of Shapely objects into a Pandas data frame and analyze the data using Pandas idioms. It allows accessing attributes and calling methods on Shapely geometries in a vectorized manner.

GeoCoon supports:

  • Point, line string and polygon geometries.

  • Vectorized GIS object attribute access and method execution

  • Pandas data selection and split-apply-combine idioms.

  • SQL/MM databases, i.e. PostgreSQL with PostGIS extension

  • Multiple geometry columns in a data frame


GeoPandas is an open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. GeoPandas further depends on fiona for file access and descartes and matplotlib for plotting.


A set of command-line tools for working with tabular data. Pandashells is an attempt to marry the expressive, concise workflow of the shell pipeline with the statistical and visualization tools of the Python data stack.


PaPy, which stands for parallel pipelines in Python, is a highly flexible framework that enables the construction of robust, scalable workflows for either generating or processing voluminous datasets. A workflow is created from user-written Python functions (nodes) connected by pipes (edges) into a directed acyclic graph. These functions are arbitrarily definable, and can make use of any Python modules or external binaries. Given a user-defined


An open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities.

The ParaView code base is designed in such a way that all of its components can be reused to quickly develop vertical applications. This flexibility allows ParaView developers to quickly develop applications that have specific functionality for a specific problem domain. ParaView runs on distributed and shared memory parallel and single processor systems. Under the hood, ParaView uses the Visualization Toolkit (VTK) as the data processing and rendering engine and has a user interface written using Qt.

Using ParaView to Visualize Scientific Data (online tutorial) -


Python scripting for scientific visualization software ParaView applied to atmospheric NetCDF data. Historically, pv_atmos has been developed to work with geophysical, and in particular, atmospheric model data (hence the name). However, pv_atmos has evolved into a very general package, and contains routines for visualizing netCDF data, and the capability to show arbitrary axes and labels in a large variety of geometries (linear and logarithmic axes, spherical geometry).

Scientific Visualisation of Atmospheric Data with ParaView (online paper) -


A thread-safe, high-performance, robust, memory efficient and easy to use software for solving large sparse symmetric and unsymmetric linear systems of equations on shared-memory and distributed-memory multiprocessors.

Features of the library version: Unsymmetric, structurally symmetric or symmetric systems, real or complex, positive definite or indefinite, hermitian. LU with complete pivoting. Parallel on SMPs and Cluster of SMPs. Automatic combination of iterative and direct solver algorithms to accelerate the solution process for very large three-dimensional systems.


Manipulating, storing and transmogrifying PDF files.


A collaborative PDF manager, which enables researchers, scholars, or students to create an annotated collection of PDF articles.


PEGASUS is a Peta-scale graph mining system, fully written in Java. It runs in a parallel, distributed manner on top of Hadoop. Hadoop is a cloud computing platform, as well as an open source implementation of the MapReduce framework, which was originally designed for web-scale data processing by Google.

Existing work on graph mining has limited scalability: usually, the maximum graph size is on the order of millions. PEGASUS breaks the limit by scaling up the algorithms to billion-scale graphs. The breakthrough was made possible by careful algorithm design and implementation for Hadoop, a massive cloud computing platform.


Pelican is a static site generator, written in Python. The features include:

  • Write your content in reStructuredText, Markdown, or AsciiDoc formats

  • Completely static output is easy to host anywhere

  • Themes that can be customized via Jinja templates

  • Publish content in multiple languages

  • Atom/RSS feeds

  • Code syntax highlighting

  • Import from WordPress, Dotclear, RSS feeds, and other services

  • Modular plugin system and corresponding plugin repository


The pathologically eclectic rubbish lister ain’t dead yet.


A Perl module that makes it possible to write Perl programs in Latin.


This software framework implements a NURBS-based Galerkin finite element method (FEM), popularly known as isogeometric analysis (IGA). It is heavily based on PETSc, the Portable, Extensible Toolkit for Scientific Computation. PETSc is a collection of algorithms and data structures for the solution of scientific problems, particularly those modeled by partial differential equations (PDEs). PETSc is written to be applicable to a range of problem sizes, including large-scale simulations where high-performance parallel computing is a must. PetIGA can be thought of as an extension of PETSc, which adds the NURBS discretization capability and the integration of forms. The PetIGA framework is intended for researchers in the numerical solution of PDEs who have applications which require extensive computational resources.


A toolkit for development of scientific applications related to processing observational data. It includes:

  • Fast linear algebra routines, including one of the fastest subroutines for inversion of a square symmetric positive definite matrix in the upper triangular representation.

  • Graphic library DiaGI (Dialog Graphic Interface) which makes a plot of one-dimensional function(s) from one call and allows a user to adjust parameters of the plot interactively.

  • Routine MatView which displays a portion of a big matrix on the screen and allows a user to change the boundaries of the displayed area interactively.

  • A set of routines for manipulating splines; various routines for multi-dimensional B-spline transform, etc.

  • Various routines for least squares, regression computation, error handler, interface to a low level I/O, date transformation etc.

See fourpack.


Petuum is a distributed machine learning framework. It aims to provide a generic algorithmic and systems interface to large scale machine learning, and takes care of difficult systems "plumbing work" and algorithmic acceleration, while simplifying the distributed implementation of ML programs - allowing you to focus on model perfection and Big Data Analytics. Petuum runs efficiently at scale on research clusters and cloud compute like Amazon EC2 and Google GCE.


The primary goal of the PHAML (Parallel Hierarchical Adaptive MultiLevel method) project is to develop new methods and software for the efficient solution of 2D elliptic partial differential equations (PDEs) on distributed memory parallel computers and multicore computers using adaptive mesh refinement and multigrid solution techniques.


PHCpack is a software package to solve polynomial systems by homotopy continuation methods.

A polynomial system is given as a sequence of polynomials in several variables. Homotopy continuation methods operate in two stages. In the first stage, a family of polynomial systems (the so-called homotopy) is constructed. This homotopy contains a polynomial system with known solutions. In the second stage, numerical continuation methods are applied to track the solution paths defined by the homotopy, starting at the known solutions and leading to the solutions of the given polynomial system.
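
The two stages can be illustrated on a single equation in one variable. The sketch below is not PHCpack code and uses none of its interfaces: the function names, the step count, and the choice of start system are assumptions made for illustration, and production solvers add safeguards (for example against paths that cross or diverge) that are omitted here. It tracks the known roots of x^2 - 1 to the roots of x^2 - 2, correcting with Newton's method at each step.

```python
def f(x):
    """Target system: its roots are what we want to find."""
    return x * x - 2


def g(x):
    """Start system with known roots +1 and -1."""
    return x * x - 1


def homotopy(x, t):
    # Stage 1: a family of systems with H(x, 0) = g(x) and H(x, 1) = f(x).
    return (1 - t) * g(x) + t * f(x)


def homotopy_dx(x, t):
    # Derivative of H with respect to x (both g and f have derivative 2x).
    return (1 - t) * 2 * x + t * 2 * x


def track(start_root, steps=20, newton_iters=5):
    """Stage 2: follow one solution path from t = 0 to t = 1."""
    x = start_root
    for k in range(1, steps + 1):
        t = k / steps
        # Newton's method pulls x back onto the path at the new value of t.
        for _ in range(newton_iters):
            x = x - homotopy(x, t) / homotopy_dx(x, t)
    return x


roots = [track(r) for r in (1.0, -1.0)]
print(roots)  # approximately [1.41421356, -1.41421356]
```

Each start root yields one path, so the number of paths tracked equals the number of known solutions of the start system.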

See also Bertini.


This documentation describes a collection of Python modules to compute solutions of polynomial systems using PHCpack.


A toolbox for developing parallel adaptive finite element programs. PHG deals with conforming tetrahedral meshes and uses bisection for adaptive local mesh refinement and MPI for message passing. PHG has an object oriented design which hides parallelization details and provides common operations on meshes and finite element functions in an abstract way, allowing the users to concentrate on their numerical algorithms.

PHG has a set of rich and easy to use interfaces to other packages, including ParMETIS, PETSc, Hypre, SuperLU, MUMPS, Trilinos, PARPACK, JDBSYM, LOBPCG.


The Pipelined Hybrid-parallel Iterative Solver Toolkit (PHIST) is what its name implies. It contains:

  • an abstract kernel interface layer defining the basic operations commonly used in iterative linear algebra solvers;

  • implementations of the interface based on a built-in sample implementation (using MPI and OpenMP), GHOST, and Trilinos; and

  • various algorithms implemented using the interface layer.


Pizco is a Python module/package that allows Python objects to communicate via ZMQ. Objects can be exposed to other processes on the same computer or over the network, allowing clear separation of concerns, resources and permissions.

As ZMQ is used as the transport layer, communication is fast and efficient, and different protocols are supported. It has complete test coverage. It runs on Python 3.2+ and requires PyZMQ. It is licensed under BSD.

Platform MPI

IBM® Platform MPI Community Edition is a no-charge community edition of IBM Platform MPI supporting the core MPI features. It is available for download, deployment, and redistribution at no charge. This edition is simple, flexible, powerful, and reliable; easy to install, embed, deploy; embodies core capabilities of Platform MPI for Linux® and Windows®; and provides an optional low cost offering that includes higher rank counts, 24/7 IBM customer support, fix packs, and upgrade protection.


PLUTO is an automatic parallelization tool based on the polyhedral model. The polyhedral model for compiler optimization provides an abstraction to perform high-level transformations such as loop-nest optimization and parallelization on affine loop nests. Pluto transforms C programs from source to source for coarse-grained parallelism and data locality simultaneously. The core transformation framework mainly works by finding affine transformations for efficient tiling and fusion, but is not limited to those. OpenMP parallel code for multicores can be automatically generated from sequential C program sections. Outer, inner, or pipelined parallelization is achieved (purely with OpenMP pragmas), in addition to register tiling and making code amenable to auto-vectorization.

Though the tool is fully automatic (C to OpenMP C), a number of options are provided (both command-line and through meta files) to tune aspects like tile sizes, unroll factors, and outer loop fusion structure. Cloog-ISL is used for code generation.
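To illustrate what tiling an affine loop nest means, the following plain-Python sketch compares an untiled matrix multiplication with a hand-tiled version that visits the same iteration points block by block. It only illustrates the shape of the transformation: the function names and the tile size are assumptions, and the tiled loop order is written by hand rather than generated by Pluto.

```python
def matmul(a, b):
    """Reference triple loop: an affine loop nest over (i, j, k)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same iteration space, traversed in tile x tile x tile blocks for locality."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for ii in range(0, n, tile):              # loops over tile origins
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                for i in range(ii, min(ii + tile, n)):    # loops inside one tile
                    for j in range(jj, min(jj + tile, p)):
                        for k in range(kk, min(kk + tile, m)):
                            c[i][j] += a[i][k] * b[k][j]
    return c
```

Both functions accumulate each `c[i][j]` in the same order of `k`, so they return identical results; only the order in which different `(i, j)` entries are touched changes, which is what improves cache reuse in a compiled language.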


A simple, accessible HTML5 media player.


A low-level generic runtime system which integrates multithreading management and a high performance multi-cluster communication library. PM2 is an umbrella software suite for high-performance runtime systems. Modules may be installed and used together or separately. The modules are:

  • NewMadeleine - a high performance communication library for clusters

  • PIOMan - a generic I/O manager designed to deal with interactions between communication and multithreading

  • PadicoTM - a component-based high performance communication framework for grid computing that enables a wide variety of middleware systems, e.g. MPI, CORBA, Java RMI, ICE, SOAP, etc.

  • Marcel - a thread library developed to meet the needs of the PM2 multithreaded environment


The Process Management Interface (PMI) has been used for quite some time as a means of exchanging wireup information needed for interprocess communication. Two versions (PMI-1 and PMI-2) have been released as part of the MPICH effort. While PMI-2 demonstrates better scaling properties than its PMI-1 predecessor, attaining rapid launch and wireup of the roughly 1M processes executing across 100k nodes expected for exascale operations remains challenging.

PMI Exascale (PMIx) represents an attempt to resolve these questions by providing an extended version of the PMI standard specifically designed to support clusters up to and including exascale sizes. The overall objective of the project is not to branch the existing pseudo-standard definitions - in fact, PMIx fully supports both of the existing PMI-1 and PMI-2 APIs - but rather to (a) augment and extend those APIs to eliminate some current restrictions that impact scalability, and (b) provide a reference implementation of the PMI-server that demonstrates the desired level of scalability.


Pochoir (pronounced "PO-shwar") is a compiler and runtime system for implementing stencil computations on multicore processors. A stencil defines the value of a grid point in a d-dimensional spatial grid at time t as a function of neighboring grid points at recent times before t. A stencil computation computes the stencil for each grid point over many time steps. Using Pochoir, a user specifies a computing kernel and boundary conditions using a simple stencil language embedded in C. The Pochoir compiler produces cache-efficient multithreaded C code that can be compiled with the Intel 12.0 compiler for C with the Cilk multithreading extensions, which is available as part of the Intel Parallel Computer suite. The Pochoir package contains two main components: a C template library for debugging and testing Pochoir compliance and a domain-specific compiler written in Haskell that produces highly optimized code.
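As a concrete instance of the stencil definition above, here is a minimal sketch of a three-point heat stencil on a one-dimensional grid, written in plain Python rather than in the Pochoir language. The function names, the coefficient `alpha`, and the fixed-value boundary condition are assumptions chosen for the example.

```python
def heat_step(u, alpha):
    """Advance one time step: each interior point depends on itself and its two neighbors."""
    new = list(u)  # the two end points keep their values (fixed boundary)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def run(u, alpha, steps):
    """A stencil computation: apply the stencil to every grid point over many time steps."""
    for _ in range(steps):
        u = heat_step(u, alpha)
    return u


print(run([0.0, 0.0, 1.0, 0.0, 0.0], 0.25, 2))
```

A naive loop like this sweeps the whole grid once per time step; the cache-efficient code described above reorders the same point updates across space and time.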


The Poincaré code is a Maple project package that aims to gather significant computer algebra normal form (and subsequent reduction) methods for handling nonlinear ordinary differential equations. As a first version, a set of fourteen easy-to-use Maple commands is introduced for symbolic creation of (improved variants of Poincaré’s) normal forms as well as their associated normalizing transformations. The software is the implementation by the authors of carefully studied and followed up selected normal form procedures from the literature, including some authors’ contributions to the subject. As can be seen, joint-normal-form programs involving Lie-point symmetries are of special interest and are published in CPC Program Library for the first time, Hamiltonian variants being also very useful as they lead to encouraging results when applied, for example, to models from computational physics like Hénon–Heiles.


Polly is a high-level loop and data-locality optimizer and optimization infrastructure for LLVM. It uses an abstract mathematical representation based on integer polyhedra to analyze and optimize the memory access pattern of a program. We currently perform classical loop transformations, especially tiling and loop fusion, to improve data-locality. Polly can also exploit OpenMP level parallelism and expose SIMDization opportunities. Work has also been done in the area of automatic GPU code generation.


A tool to study the combinatorics and the geometry of convex polytopes and polyhedra. It is also capable of dealing with simplicial complexes, matroids, polyhedral fans, graphs, tropical objects, and other objects.


Pomegranate is an open source Python application that implements the open Webification (w10n) Science API for major scientific data stores (HDF, NetCDF, etc.). It makes the inner components of a file, such as attributes and data arrays, directly addressable and accessible via well-defined and meaningful URLs.

Data exposed by w10n-sci API is readily consumable by any HTTP client. It can be as simple as a command line like curl or wget, or as advanced as a full-fledged HTML5 web application such as REX.

Pomegranate has been included in Taiga, a turnkey software tool that simplifies the use of scientific data.

It can be installed as a command line tool and/or a ReSTful web service.

Source code is available at Open Channel Software. However, please note that Pomegranate alone won’t be enough to establish a w10n-sci service. What you really need is the instruction file service-setup.txt, which details the steps necessary to build, install and configure a complete service. Alternatively, use a turnkey solution like Taiga, so that you can be up and running in minutes.



Serves a fully RESTful API from any existing PostgreSQL database. It provides a cleaner, more standards-compliant, faster API than you are likely to write from scratch.


We investigate performance improvements for the discrete element method (DEM) used in ppohDEM. First, we use OpenMP and MPI to parallelize DEM for efficient operation on many types of memory, including shared memory, and at any scale, from small PC clusters to supercomputers. We also describe a new algorithm for the descending storage method (DSM) based on a sort technique that makes creation of contact candidate pair lists more efficient. Finally, we measure the performance of ppohDEM using the proposed improvements, and confirm that computational time is significantly reduced. We also show that the parallel performance of ppohDEM can be improved by reducing the number of OpenMP threads per MPI process.


Precimonious employs a dynamic program analysis technique to find a lower floating-point precision that can be used in any part of a program. Precimonious performs a search on the program variables trying to lower their precision subject to accuracy constraints and performance goals. The tool then recommends a type instantiation for these variables using less precision while producing an accurate enough answer without causing exceptions.
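The idea of searching for the lowest precision that still meets an accuracy constraint can be sketched as follows. This is a deliberately simplified illustration and not Precimonious's search: it lowers the precision of all variables uniformly instead of per variable, emulates reduced precision by round-tripping values through Python's `struct` module, and every name in it (`round_to`, `lowest_precision`, `make_program`) is hypothetical.

```python
import struct

# Candidate floating-point formats, lowest precision first (struct format codes).
FORMATS = [("half", "e"), ("single", "f"), ("double", "d")]


def round_to(value, code):
    """Round a Python float to the given IEEE format and back."""
    return struct.unpack(code, struct.pack(code, value))[0]


def lowest_precision(program, reference, tolerance):
    """Return the first format whose result is within `tolerance` of `reference`."""
    for name, code in FORMATS:
        result = program(lambda v: round_to(v, code))
        if abs(result - reference) <= tolerance:
            return name
    return None


def make_program(k):
    """Toy program computing ((1 + 2**-k) - 1) * 2**k, rounding after every operation."""
    def program(rnd):
        x = rnd(1.0 + 2.0 ** -k)
        d = rnd(x - 1.0)
        return rnd(d * 2.0 ** k)
    return program


for k in (5, 20, 40):
    print(k, lowest_precision(make_program(k), 1.0, 1e-9))
```

The toy program returns 1 only when the format has enough significand bits to keep the term `2**-k` next to 1, so larger values of `k` force the search toward higher precision.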


PREESM is an open source rapid prototyping tool. It simulates signal processing applications and generates code for heterogeneous multi/many-core embedded systems. Its dataflow language eases the description of parallel signal processing applications.

The PREESM tool inputs are an algorithm graph, an architecture graph, and a scenario which is a set of parameters and constraints that specify the conditions under which the deployment will run. The chosen type of algorithm graph is a parameterized and hierarchical extension of Synchronous Dataflow (SDF) graphs named PiSDF. The architecture graph is named System-Level Architecture Model (S-LAM). From these inputs, PREESM maps and schedules automatically the code over the multiple processing elements and generates multi-core code.

PREESM is an Eclipse plug-in.


Describe your software project just once, using Premake’s simple and easy to read syntax, and build it everywhere.

Generate project files for Visual Studio, GNU Make, Xcode, Code::Blocks, and more across Windows, Mac OS X, and Linux. Use the full featured Lua scripting engine to make build configuration tasks a breeze.


PRIMME is a C library to find a number of eigenvalues and their corresponding eigenvectors of a Real Symmetric, or Complex Hermitian matrix A. Symmetric and Hermitian eigenvalue problems enjoy a remarkable theoretical structure that allows for efficient and stable algorithms for obtaining a few required eigenpairs. This is probably one of the reasons that enabled applications requiring the solution of symmetric eigenproblems to push their accuracy and thus computational demands to unprecedented levels. Materials science, structural engineering, and some QCD applications routinely compute eigenvalues of matrices of dimension more than a million; and often much more than that! Typically, with increasing dimension comes increased ill conditioning, and thus the use of preconditioning becomes essential.


A programming language, development environment, and online community. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. Initially created to serve as a software sketchbook and to teach computer programming fundamentals within a visual context, Processing evolved into a development tool for professionals.

Processing continues to be an alternative to proprietary software tools with restrictive and expensive licenses, making it accessible to schools and individual students. Its open source status encourages the community participation and collaboration that is vital to Processing’s growth. Contributors share programs, contribute code, and build libraries, tools, and modes to extend the possibilities of the software. The Processing community has written more than a hundred libraries to facilitate computer vision, data visualization, music composition, networking, 3D file exporting, and programming electronics.


A collection of classes that do the heavy lifting for you, so that you only need to write a minimal amount of code. This library is compatible with both Processing and Processing.js.


A Processing library easily extending sketches to distributed display environments.


Processing.js is the sister project of the popular Processing visual programming language, designed for the web. Processing.js makes your data visualizations, digital art, interactive animations, educational graphs, video games, etc. work using web standards and without any plug-ins. You write code using the Processing language, include it in your web page, and Processing.js does the rest.


This project provides a Python package that creates an environment for graphics applications that closely resembles that of the Processing system. The project mission is to implement Processing’s friendly graphics functions and interaction model in Python. Not all of Processing is to be ported, though, since Python itself already provides alternatives for many features of Processing, such as XML parsing. The pyprocessing backend is built upon OpenGL and Pyglet, which provide the actual graphics rendering. Since these are multiplatform, so is pyprocessing.


An independent, open source library collection for computational design tasks with Java & Processing. The classes are purposefully kept fairly generic in order to maximize re-use in different contexts ranging from generative design, animation, interaction/interface design, data visualization to architecture and digital fabrication, use as teaching tool and more.

programming language

Languages of special interest, i.e. that your author finds shiny.


A little language for machines with Speech Acts inspired by Elephant 2000. The parser uses the wonderful Clojure Instaparse library. The language aims to have syntactically sugared "speech acts" that the machine uses as inputs and outputs. The language also supports beliefs and goals from McCarthy’s paper, Ascribing Mental Qualities to Machines.


Bloom is a programming language for the cloud and other distributed computing systems. BOOM is the research project at UC Berkeley that is developing Bloom, as part of a larger agenda to make it easy to build distributed software systems. Bloom removes traditional mismatches between distributed software and platforms, enabling powerful coding and code analysis without resorting to exotic syntax.

Bloom was designed to match–and exploit–the disorderly reality of distributed systems. Bloom programmers write programs made up of unordered collections of statements, and are given constructs to impose order when needed. The standard data structures in Bloom are disorderly collections, rather than scalar variables and structures. These data structures reflect the realities of non-deterministic ordering inherent in distributed systems. Bloom provides simple, familiar syntax for manipulating these structures.

D Language

The D programming language is an object-oriented, imperative, multi-paradigm system programming language. Though it originated as a re-engineering of C, D is a distinct language, having redesigned some core C features while also taking inspiration from other languages, notably Java, Python, Ruby, C#, and Eiffel. D’s design goals attempt to combine the performance and safety of compiled languages with the expressive power of modern dynamic languages. Idiomatic D code is commonly as fast as equivalent C++ code, while being shorter and memory safe. Type inference, automatic memory management and syntactic sugar for common types allow faster development, while bounds checking, design by contract features and a concurrency-aware type system help reduce the occurrence of bugs.

D supports five main programming paradigms—imperative, object-oriented, metaprogramming, functional and concurrent (Actor model). C’s application binary interface (ABI) is supported as well as all of C’s fundamental and derived types, enabling direct access to existing C code and libraries. D bindings are available for many popular C libraries. C’s standard library is part of standard D.


DRAKON is a visual language for specifications from the Russian space program. DRAKON is used for capturing requirements and building software that controls spacecraft. The rules of DRAKON are optimized to ensure easy understanding by human beings.

DRAKON Editor is a free tool for authoring DRAKON flowcharts. It also supports sequence diagrams, entity-relationship and class diagrams. The user interface of DRAKON Editor is extremely simple and straightforward. Software developers can build real programs with DRAKON Editor. Source code can be generated in several programming languages, including Java, D, C#, C/C++ (with Qt support), Python, Tcl, Javascript, Lua, Erlang, AutoHotkey and Verilog.


Eon is the first energy-aware programming language. It is a declarative coordination language and runtime system designed to simplify the development of perpetual systems by separating program logic from energy management. Using Eon, the system designer describes program operation as well as how the program can be adjusted in order to conserve energy. During operation the Eon runtime system automatically adjusts the program in order to sustain operation based on online measurements of energy harvest and per-task energy consumption.


A language for connecting technologies using pure metaphors.


Esterel is a programming language dedicated to control-dominated reactive systems, such as control circuits, embedded systems, human-machine interface, or communication protocols.


The Factor programming language combines powerful language features with a full-featured library. The implementation is fully compiled for performance, while still supporting interactive development. Factor applications are portable between all common platforms. Factor can deploy stand-alone applications on all platforms.

Factor belongs to the family of concatenative languages: this means that, at the lowest level, a Factor program is a series of words (functions) that manipulate a stack of references to dynamically-typed values. This gives the language a powerful foundation which allows many abstractions and paradigms to be built on top.
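The concatenative model can be illustrated with a toy interpreter in which a program is a sequence of words that each transform a shared stack. This is a minimal Python sketch, not Factor's implementation; the function `run` and the handful of words it supports are assumptions chosen for the example.

```python
def run(program, stack=None):
    """Execute a whitespace-separated series of words against a data stack."""
    stack = [] if stack is None else list(stack)
    words = {
        "dup": lambda s: s.append(s[-1]),                 # ( a -- a a )
        "swap": lambda s: s.extend([s.pop(), s.pop()]),   # ( a b -- b a )
        "drop": lambda s: s.pop(),                        # ( a -- )
        "+": lambda s: s.append(s.pop() + s.pop()),       # ( a b -- a+b )
        "*": lambda s: s.append(s.pop() * s.pop()),       # ( a b -- a*b )
    }
    for token in program.split():
        if token in words:
            words[token](stack)
        else:
            stack.append(int(token))  # anything else is treated as an integer literal
    return stack


print(run("2 3 + dup *"))  # push 2 and 3, add them, duplicate the sum, multiply
```

Because every word simply maps a stack to a stack, placing two programs side by side composes them, which is where the term "concatenative" comes from.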


Halide is a new programming language designed to make it easier to write high-performance image processing code on modern machines. Its current front end is embedded in C++. Compiler targets include x86/SSE, ARM v7/NEON, CUDA, Native Client, and OpenCL.


HANSEI is an embedded domain-specific language for probabilistic programming: for writing potentially infinite discrete-distribution models and performing exact inference, importance sampling and inference of inference.

HANSEI is an ordinary OCaml library, with probability distributions represented as ordinary OCaml programs. Delimited continuations let us reify non-deterministic programs as lazy search trees, which we may then traverse, explore, or sample. Thus an inference procedure and a model invoke each other as co-routines. Thanks to the delimited control, deterministic expressions look exactly like ordinary OCaml expressions, and are evaluated as such, without any overhead.
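Exact inference over a discrete model can be sketched by enumerating every combination of random choices and summing the weights of the worlds that survive the observations. The Python snippet below only illustrates that idea: it is not HANSEI's API, it enumerates eagerly instead of building a lazy search tree with delimited continuations, and the names `flip` and `exact_inference` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def flip(p):
    """A weighted coin, given as a list of (probability, outcome) pairs."""
    return [(p, True), (1 - p, False)]


def exact_inference(choices, model):
    """Enumerate all worlds and return the normalized distribution of model results.

    `model` receives one outcome per choice and returns None to reject a world
    that contradicts an observation. At least one world must be accepted.
    """
    posterior = {}
    for world in product(*choices):
        weight = Fraction(1)
        values = []
        for prob, value in world:
            weight *= prob
            values.append(value)
        result = model(*values)
        if result is not None:
            posterior[result] = posterior.get(result, 0) + weight
    total = sum(posterior.values())
    return {result: weight / total for result, weight in posterior.items()}


def both_heads_given_one(a, b):
    """Observe that at least one coin shows heads, then ask whether both do."""
    if not (a or b):
        return None
    return a and b


half = Fraction(1, 2)
print(exact_inference([flip(half), flip(half)], both_heads_given_one))
```

Using `Fraction` keeps the probabilities exact, so the result is a rational number rather than a floating-point approximation.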


The Heterogeneous Image Processing Acceleration Framework allows the design of image processing kernels and algorithms in a domain-specific language (DSL). From this high-level description, low-level target code for GPU accelerators is generated using source-to-source translation. As back ends, the framework supports CUDA, OpenCL, and Renderscript.

J Language

The J programming language, developed in the early 1990s by Kenneth E. Iverson and Roger Hui, is a synthesis of APL (also by Iverson) and the FP and FL function-level languages created by John Backus. J is a very terse array programming language, and is most suited to mathematical and statistical programming, especially when performing operations on matrices.


Jolie is an open-source programming language for developing distributed applications based on microservices. In the programming paradigm proposed with Jolie, each program is a service that can communicate with other programs by sending and receiving messages over a network. Jolie supports an abstraction layer that allows services to communicate over different mediums, ranging from TCP/IP sockets to local in-memory communications between processes. Jolie is currently supported by an interpreter implemented in the Java language, which can be run on multiple operating systems. Since it supports the orchestration of Web Services, Jolie is an alternative to XML-based orchestration languages such as WS-BPEL, as it offers a concise (C-like) syntax for accessing XML-like data structures.


A language and compiler that turn image processing graphs (written in a specific language plus C) into a single merged OpenCL kernel tuned for the target many-core architecture.


LLJS is a typed dialect of JavaScript that offers a C-like type system with manual memory management. It compiles to JavaScript and lets you write memory-efficient and GC pause-free code less painfully. In short, LLJS is the bastard child of JavaScript and C. LLJS is early research prototype work, so don’t expect anything rock solid just yet. The research goal here is to explore low-level statically typed features in a high-level dynamically typed language. Think of it as inline assembly in C, or the unsafe keyword in C#. It’s not pretty, but it gets the job done.


MLton is an open source, whole-program optimizing compiler for the Standard ML (SML) programming language. MLton aims to produce fast executables, and to encourage rapid prototyping and modular programming by eliminating performance penalties often associated with the use of high-level language features.

MLton development began in 1997, and continues with a worldwide community of developers and users, who have helped to port MLton to a number of platforms. As a whole-program compiler, MLton is notable among SML environments for lacking an interactive top level, a feature common to most SML implementations such as Standard ML of New Jersey (SML/NJ). MLton also includes several libraries in addition to the SML Basis Library, as well as features to aid in porting code from SML/NJ, one of the more popular SML implementations. MLton also aims to make programming in the large more feasible through the MLBasis system, which simplifies modularity and namespace management in larger pieces of code.


The Mozart Programming System combines ongoing research in programming language design and implementation, constraint logic programming, distributed computing, and human-computer interfaces. Mozart implements the Oz language and provides both expressive power and advanced functionality. Mozart excels in creating distributed, concurrent applications, because it makes a network fully transparent. It supports GUI applications through Tcl/Tk integration. Because it runs applications in a virtual machine, applications can be developed once and run on many different platforms.

The PLDC Research Group at UCL is proud to announce the first release of Mozart 2. This release contains a completely redesigned 64-bit virtual machine (compatible with 32-bit and 64-bit processors), and adds an extension interface to the virtual machine to allow language extensions defined within Oz. The PLDC Research Group will use Mozart 2 for future programming education and future research in programming language design and implementation.

The first release of Mozart 2 does not provide support for constraints or distributed programming. We plan for successive releases to support constraint programming with an interface to the Gecode system, and to support distributed programming with a peer-to-peer transactional storage and extensions for network-transparent and synchronization-free programming.


A compiled, garbage-collected systems programming language which has an excellent productivity/performance ratio. Nim’s design focuses on efficiency, expressiveness, and elegance (in that order of priority).

Nim (formerly known as "Nimrod") is a statically typed, imperative programming language that tries to give the programmer ultimate power without compromises on runtime efficiency. This means it focuses on compile-time mechanisms in all their various forms.

Beneath a nice infix/indentation based syntax with a powerful (AST based, hygienic) macro system lies a semantic model that supports a soft realtime GC on thread local heaps. Asynchronous message passing is used between threads, so no "stop the world" mechanism is necessary. An unsafe shared memory heap is also provided for the increased efficiency that results from that model.


Oberon is a general-purpose programming language created in 1986 by Professor Niklaus Wirth. It is the latest member of the Wirthian family of ALGOL-like languages (Euler, Algol-W, Pascal, Modula, and Modula-2). Oberon was the result of a concentrated effort to increase the power of Modula-2, the direct successor of Pascal, and simultaneously to reduce its complexity. Its principal new feature is the concept of type extension of record types: it permits constructing new data types on the basis of existing ones and relating them, deviating from the dogma of strictly static data typing.

The new System since 2008 is now called A2. A2 is the name of a modern integrated software environment. It is a single-user, multi-tasking system that runs on bare hardware or on top of a host operating system.


OCaml is the main implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. OCaml extends the core Caml language with object-oriented constructs.

OCaml’s toolset includes an interactive top level interpreter, a bytecode compiler, and an optimizing native code compiler. It has a large standard library that makes it useful for many of the same applications as Python or Perl, as well as robust modular and object-oriented programming constructs that make it applicable for large-scale software engineering. OCaml is the successor to Caml Light. The acronym CAML originally stood for Categorical Abstract Machine Language, although OCaml abandons this abstract machine.

OCaml is a free open source project managed and principally maintained by INRIA. In recent years, many new languages have drawn elements from OCaml, most notably F# and Scala.


Pharo is an open source implementation of the programming language and environment Smalltalk. Pharo offers strong live programming features such as immediate object manipulation, live update, and hot recompilation. The live programming environment is at the heart of the system. Pharo also supports advanced web development with frameworks such as Seaside and, more recently, Tide.


Seaside provides a layered set of abstractions over HTTP and HTML that let you build highly interactive web applications quickly, reusably and maintainably. It is based on Smalltalk.


Picat is a simple, and yet powerful, logic-based multi-paradigm programming language aimed at general-purpose applications. Picat is a rule-based language, in which predicates, functions, and actors are defined with pattern-matching rules. Picat incorporates many declarative language features for better productivity of software development, including explicit non-determinism, explicit unification, functions, list comprehensions, constraints, and tabling. Picat also provides imperative language constructs, such as assignments and loops, for programming everyday things. The Picat implementation, which is based on a well-designed virtual machine and incorporates a memory manager that garbage-collects and expands the stacks and data areas when needed, is efficient and scalable. Picat can be used not only for symbolic computations, which is a traditional application domain of declarative languages, but also for scripting and modeling tasks.


Push is a programming language designed for evolutionary computation, to be used as the programming language within which evolving programs are expressed.


A programming language designed to serve as an outstanding choice for programming education while exploring the confluence of scripting and functional programming. Pyret has Python-inspired syntax for functions, lists, and operators. Iteration constructs are designed to be evocative of those in other languages.


A full-spectrum programming language. It goes beyond Lisp and Scheme with dialects that support objects, types, laziness, and more. Racket enables programmers to link components written in different dialects, and it empowers programmers to create new, project-specific dialects. Racket’s libraries support applications from web servers and databases to GUIs and charts.

Creating Languages in Racket -


A program that synthesizes floating-point programs from real-number programs, automatically handling simple numerical instabilities. Herbie can solve many simple and not-so-simple problems. It can improve the accuracy of many real-world programs, and successfully solves most problems from Richard Hamming’s Numerical Methods for Scientists and Engineers, Chapter 3. Installing Herbie requires Racket.
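A classic example of the kind of instability involved is the cancellation in sqrt(x + 1) - sqrt(x) for large x. The sketch below contrasts the direct formula with an algebraically equivalent rewrite; it is written by hand for illustration and is not claimed to be output produced by Herbie. The function names are assumptions.

```python
import math


def naive(x):
    """Direct translation of sqrt(x + 1) - sqrt(x); loses accuracy when x is large."""
    return math.sqrt(x + 1) - math.sqrt(x)


def rewritten(x):
    """Equivalent form obtained by multiplying by the conjugate; avoids the subtraction."""
    return 1 / (math.sqrt(x + 1) + math.sqrt(x))


for x in (1.0, 1e8, 1e16):
    print(x, naive(x), rewritten(x))
```

For moderate x the two functions agree closely, while for x around 1e16 the addition `x + 1` is rounded away in double precision and the naive version collapses to zero.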


A book-publishing system written in Racket. If you think documents should be programmable, you’ll love it. If not, you can move along. Pollen gives you access to a full programming language (Racket) with a text-based syntax that makes it easy to embed code within your documents.


Scribble is a collection of tools for creating prose documents—papers, books, library documentation, etc.—in HTML or PDF (via Latex) form. More generally, Scribble helps you write programs that are rich in textual content, whether the content is prose to be typeset or any other form of text to be generated programmatically.


SAC (Single Assignment C) is a strict purely functional programming language whose design is focussed on the needs of numerical applications. Particular emphasis is laid on efficient support for array processing. Efficiency concerns are essentially twofold. On the one hand, efficiency in program development is to be improved by the opportunity to specify array operations on a high level of abstraction. On the other hand, efficiency in program execution, i.e. the runtime performance of programs both in time and memory consumption, is still to be achieved by sophisticated compilation schemes. Only as far as the latter succeeds, the high-level style of specifications can actually be called useful.

In order to overcome the acceptance problems encountered by other functional or array based languages intended for numerical / array intensive applications, e.g. Sisal, Nesl, Nial, APL, J, or K, particular regard is paid to ease the transition from a C / Fortran like programming environment to SAC.


Shen is a portable functional programming language that offers pattern matching, lambda calculus consistency, macros, optional lazy evaluation, static type checking, an integrated fully functional Prolog, and an inbuilt compiler-compiler.

Shen has one of the most powerful type systems within functional programming. Shen runs under a reduced instruction Lisp and is designed for portability. The word ‘Shen’ is Chinese for spirit and our motto reflects our desire to liberate our work to live under many platforms. Shen is under BSD and currently runs under CLisp and SBCL, Clojure, Scheme, Ruby, Python, the JVM and Javascript.


Twelf is a language used to specify, implement, and prove properties of deductive systems such as programming languages and logics. Twelf is a piece of computer software, and it is also a computer language understood by the Twelf software. C code and Java code describe programs, HTML code describes graphical web pages, and Twelf code describes logical systems.

The reason someone might want to use Twelf code to describe a logical system is that once they’ve described it, they can write more Twelf code that uses that logical system. You could use Twelf to write out a statement about basic arithmetic (for instance, “if a + b = c, then b + a = c”), and then use Twelf to write out a justification of why that statement is true (i.e. a proof). When you do so, Twelf will check your proof, making sure that what you said actually is true!

It turns out that while basic arithmetic, set theory, and interesting logics are logical systems, programming languages are also logical systems - and Twelf has a couple of unique features that make it a great tool to use when the logical systems you are working with are programming languages.
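
The arithmetic statement quoted above can be written as a machine-checked proof. The following is a small sketch in Lean 4 rather than Twelf, so it shows only the general idea of stating a property and having software check its justification; the theorem name add_swap is made up for illustration.

```lean
-- If a + b = c, then b + a = c (natural numbers).
-- Nat.add_comm b a rewrites the goal's left side from b + a to a + b.
theorem add_swap (a b c : Nat) (h : a + b = c) : b + a = c := by
  rw [Nat.add_comm b a]
  exact h
```

If the justification were wrong, the checker would reject it, which is the same guarantee the Twelf description promises.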


We describe an implementation to solve Poissonʼs equation for an isolated system on a unigrid mesh using FFTs. The method solves the equation globally on mesh blocks distributed across multiple processes on a distributed-memory parallel computer. Test results to demonstrate the convergence and scaling properties of the implementation are presented. The solver is offered to interested users as the library PSPFFT.
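
To illustrate the idea of solving Poisson's equation in Fourier space, here is a minimal sketch in plain Python. It is not PSPFFT and makes simplifying assumptions: one dimension, a single process, periodic boundaries instead of an isolated system, and a naive discrete Fourier transform standing in for an FFT. All function names are hypothetical.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stands in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for j, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * math.pi * k * j / n)
        out.append(total / n if inverse else total)
    return out


def solve_poisson_periodic(f, length):
    """Solve u'' = f on a periodic 1-D grid, fixing the mean of u to zero."""
    n = len(f)
    f_hat = dft(f)
    u_hat = [0j] * n  # the k = 0 mode stays zero (zero-mean solution)
    for k in range(1, n):
        # Map the array index to a signed wavenumber.
        m = k if k <= n // 2 else k - n
        wavenumber = 2 * math.pi * m / length
        # Differentiating twice multiplies each mode by -(wavenumber^2).
        u_hat[k] = -f_hat[k] / wavenumber ** 2
    return [c.real for c in dft(u_hat, inverse=True)]


n = 32
length = 2 * math.pi
x = [length * i / n for i in range(n)]
f = [-math.sin(xi) for xi in x]  # u = sin(x) satisfies u'' = -sin(x)
u = solve_poisson_periodic(f, length)
max_error = max(abs(ui - math.sin(xi)) for ui, xi in zip(u, x))
```

Handling an isolated (non-periodic) system and distributing the mesh across processes, as the library does, requires additional machinery that this sketch leaves out.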


The Parallel Ultra-Light Systolic Array Runtime (PULSAR), now in version 2.0, is a complete programming platform for large-scale distributed memory systems with multicore processors and hardware accelerators. PULSAR provides a simple abstraction layer over multithreading, message-passing, and multi-GPU, multi-stream programming. PULSAR offers a general-purpose programming model, suitable for a wide range of scientific and engineering applications.

This simple programming model allows the user to define the computation in the form of a Virtual Systolic Array (VSA), which is a set of Virtual Data Processors (VDPs) connected by data channels. This programming model is accessible to the user through a very small and simple Application Programming Interface (API), and all the complexity of executing the workload on a large-scale system is hidden in the runtime implementation.

The runtime supports distributed memory systems with multicore processors and relies on POSIX Threads (a.k.a. Pthreads) for intra-node multithreading, and on the Message Passing Interface (MPI) for inter-node communication. The runtime also supports multiple Nvidia GPU accelerators, in each distributed memory node, using the Compute Unified Device Architecture (CUDA) platform.
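
The Virtual Systolic Array idea described above (processors connected by data channels) can be sketched as a toy in plain Python. This is not the PULSAR API: the class VDP and the helpers connect and run are hypothetical names, and everything runs sequentially in one process.

```python
from collections import deque


class VDP:
    """Toy virtual data processor: reads one packet, transforms it, forwards it."""

    def __init__(self, name, operation):
        self.name = name
        self.operation = operation
        self.inbox = deque()   # incoming data channel
        self.outbox = None     # outgoing channel, set by connect()

    def fire(self):
        """Process a single packet if one is waiting; report whether work was done."""
        if not self.inbox:
            return False
        packet = self.operation(self.inbox.popleft())
        if self.outbox is not None:
            self.outbox.append(packet)
        return True


def connect(source, destination):
    """Wire the output channel of one VDP to the input channel of another."""
    source.outbox = destination.inbox


def run(vdps):
    """Sweep over all VDPs until none of them has input left."""
    while any([vdp.fire() for vdp in vdps]):
        pass


scale = VDP("scale", lambda v: v * 2)
shift = VDP("shift", lambda v: v + 1)
sink = VDP("sink", lambda v: v)
results = deque()
sink.outbox = results          # collect final packets here
connect(scale, shift)
connect(shift, sink)
scale.inbox.extend([1, 2, 3])
run([scale, shift, sink])
```

In the real runtime the firing of processors would be spread over threads, nodes and accelerators; the toy only shows how a computation is expressed as processors plus channels.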

Pure Data

Pure Data (aka Pd) is an open source visual programming language. Pd enables musicians, visual artists, performers, researchers, and developers to create software graphically, without writing lines of code. Pd is used to process and generate sound, video, 2D/3D graphics, and interface sensors, input devices, and MIDI. Pd can easily work over local and remote networks to integrate wearable technology, motor systems, lighting rigs, and other equipment. Pd is suitable for learning basic multimedia processing and visual programming methods as well as for realizing complex systems for large-scale projects.

Pd is a so-called data flow programming language, in which programs called patches are developed graphically. Algorithmic functions are represented by objects, placed on a screen called the canvas. Objects are connected together with cords, and data flows from one object to another through these cords. Each object performs a specific task, from very low-level mathematical operations to complex audio or video functions such as reverberation, FFT transforms, or video decoding.


Linux-centric monolithic distribution based on pd-extended with focus on solid/stable core, enhancements, and usability features including infinite undo, gui-based iemgui object editing, accelerated visual editor and gui operations, improved appearance, K12 education mode, and more. The distribution is developed for and maintained by Virginia Tech’s Linux Laptop Orchestra (L2Ork).


Pushpin is a new way to build realtime HTTP and WebSocket services.


A collection of standard atmospheric and oceanic sciences routines.


The Python ARM Radar Toolkit, Py-ART, is an open source Python module containing a growing collection of weather radar algorithms and utilities built on top of the Scientific Python stack and distributed under the 3-Clause BSD license. Py-ART is used by the Atmospheric Radiation Measurement (ARM) Climate Research Facility for working with data from a number of precipitation and cloud radars, but has been designed so that it can be used by others in the radar and atmospheric communities to examine, process, and analyse data from many types of weather radars.


A library which follows the Python/C API as closely as possible, while providing equivalent functionality for Objective Caml. It is built against Python 2.x and OCaml 3.04.

It is intended to allow users to build native OCaml libraries and use them from Python and, conversely, to allow OCaml users to benefit from linkable libraries provided for Python.


pyDatalog adds the logic programming paradigm to Python’s extensive toolbox, in a pythonic way.

Logic programmers can now use the extensive standard library of Python, and Python programmers can now express complex algorithms quickly.

Datalog is a truly declarative language derived from Prolog, with strong academic foundations. Datalog excels at managing complexity. Datalog programs are shorter than their Python equivalents, and Datalog statements can be specified in any order, as simply as formulas in a spreadsheet.
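
To give a feel for the declarative style, here is a minimal sketch of how two Datalog-style rules can be evaluated to a fixed point. It uses plain Python sets rather than the pyDatalog API, and the function name ancestors and the sample facts are made up for illustration.

```python
def ancestors(parent_facts):
    """Naive bottom-up evaluation of two rules:

        ancestor(X, Y) <= parent(X, Y)
        ancestor(X, Y) <= parent(X, Z) & ancestor(Z, Y)
    """
    known = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        # Second rule: join parent(X, Z) with the ancestor(Z, Y) facts known so far.
        derived = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in known
            if z == z2
        }
        if derived <= known:   # nothing new: fixed point reached
            return known
        known |= derived


parents = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
result = ancestors(parents)
```

Because the facts live in a set and the rules are applied until nothing changes, the order in which facts are listed does not affect the result, which mirrors the "any order" property mentioned above.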


Pydgin provides a collection of classes and functions which act as an embedded architectural description language (embedded-ADL) for concisely describing the behavior of instruction set simulators (ISS). An ISS described in Pydgin can be directly executed in a Python interpreter for rapid prototyping and debugging, or alternatively can be used to automatically generate a performant, JIT-optimizing C executable more suitable for application development.

Automatic generation of JIT-enabled ISS from Pydgin is enabled by the RPython Translation Toolchain, an open-source tool used by developers of the PyPy JIT-optimizing Python interpreter.

An ISS described in Pydgin implements an interpretive simulator which can be directly executed in a Python interpreter for rapid prototyping and debugging. However, Pydgin ISS can also be automatically translated into a C executable implementing a JIT-enabled interpretive simulator, providing a high-performance implementation suitable for application development. Generated Pydgin executables provide significant performance benefits in two ways. First, the compiled C implementation enables much more efficient execution of instruction-by-instruction interpretive simulation than the original Python implementation. Second, the generated executable provides a trace-JIT to dynamically compile frequently interpreted hot loops into optimized assembly.
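
As a rough picture of what an instruction-by-instruction interpretive simulator does, the following toy runs a made-up four-register machine in plain Python. It does not use Pydgin's classes; the instruction names (li, add, addi, bne, halt) and the function simulate are hypothetical.

```python
def simulate(program, max_steps=1000):
    """Fetch, decode and execute instructions one at a time; return the registers."""
    regs = [0] * 4
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]          # fetch and decode
        if op == "li":                   # load immediate: rd <- value
            regs[args[0]] = args[1]
        elif op == "add":                # rd <- rs + rt
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "addi":               # rd <- rs + value
            regs[args[0]] = regs[args[1]] + args[2]
        elif op == "bne":                # branch to target if rs != rt
            if regs[args[0]] != regs[args[1]]:
                pc = args[2]
                continue
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return regs


# Sum the integers 5, 4, 3, 2, 1 into register 0.
program = [
    ("li", 0, 0),        # r0 = accumulator
    ("li", 1, 5),        # r1 = counter
    ("li", 2, 0),        # r2 = constant zero
    ("add", 0, 0, 1),    # r0 += r1
    ("addi", 1, 1, -1),  # r1 -= 1
    ("bne", 1, 2, 3),    # loop back while r1 != 0
    ("halt",),
]
final_regs = simulate(program)
```

The backward branch at the end of the loop is the kind of frequently interpreted hot loop that a trace-JIT, as described above, would compile into optimized code.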


PyDom is a Python package which implements various diagnostics for NEMO model output.


PyFR is an open-source Python based framework for solving advection-diffusion type problems on streaming architectures using the Flux Reconstruction approach of Huynh. The framework is designed to solve a range of governing systems on mixed unstructured grids containing various element types. It is also designed to target a range of hardware platforms via use of an in-built domain specific language derived from the Mako templating engine.


A cross-platform windowing and multimedia library for Python. Pyglet provides an object-oriented programming interface for developing games and other visually-rich applications for Windows, Mac OS X and Linux. Features include:

  • No external dependencies or installation requirements. For most application and game requirements, pyglet needs nothing else besides Python, simplifying distribution and installation.

  • Take advantage of multiple windows and multi-monitor desktops. pyglet allows you to use as many windows as you need, and is fully aware of multi-monitor setups for use with fullscreen games.

  • Load images, sound, music and video in almost any format. pyglet can optionally use AVbin to play back audio formats such as MP3, OGG/Vorbis and WMA, and video formats such as DivX, MPEG-2, H.264, WMV and Xvid.


PyGeoIf provides a GeoJSON-like protocol for geo-spatial (GIS) vector data. When you want to write your own geospatial library with support for this protocol, you may use pygeoif as a starting point and build your functionality on top of it.

You may think of pygeoif as a shapely ultralight which lets you construct geometries and perform very basic operations like reading and writing geometries from/to WKT, constructing line strings out of points, polygons from linear rings, multi polygons from polygons, etc. It was inspired by shapely and implements the geometries in a way that when you are familiar with shapely you feel right at home with pygeoif. It was written to provide clean and python only geometries for fastkml.
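
The kind of object described above can be sketched in a few lines. The example assumes that the GeoJSON-like protocol is the __geo_interface__ convention used by Python geometry libraries; the Point and LineString classes below are simplified stand-ins written for illustration, not pygeoif's own classes.

```python
class Point:
    """Minimal 2-D point exposing a GeoJSON-like mapping and WKT text."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def __geo_interface__(self):
        return {"type": "Point", "coordinates": (self.x, self.y)}

    @property
    def wkt(self):
        return f"POINT ({self.x} {self.y})"


class LineString:
    """Line string constructed out of points."""

    def __init__(self, points):
        self.coords = tuple((p.x, p.y) for p in points)

    @property
    def __geo_interface__(self):
        return {"type": "LineString", "coordinates": self.coords}

    @property
    def wkt(self):
        pairs = ", ".join(f"{x} {y}" for x, y in self.coords)
        return f"LINESTRING ({pairs})"


line = LineString([Point(0, 0), Point(1, 2)])
mapping = line.__geo_interface__
text = line.wkt
```

Any library that understands the same mapping layout can consume such objects without knowing the concrete class, which is the point of sharing a protocol.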


Utilities for applying scikit-learn to spatial datasets.


The Kepler archive contains time-series data that have been calibrated and reduced from detector pixels. This pipelined reduction includes the removal of time-series trends systematic to the spacecraft and its environment rather than the targets. For every target there is a level of subjectivity required to reduce systematics. Differing scientific goals are likely to have differing requirements for systematic mitigation. Systematic reduction in the Kepler pipeline is optimized to yield the highest number of potentially-detectable exoplanet transits from a sample of 200,000 stars. PyKE, on the other hand, is a group of Python tasks developed for the reduction and analysis of Kepler pixel-level data and Simple Aperture Photometry (SAP) data of individual targets with individual characteristics. PyKE was developed to provide alternative data reduction, tunable to the user’s specific science goals. The main purposes of these tasks are to i) re-extract light curves from manually-chosen pixel apertures and ii) cotrend and/or detrend the data in order to reduce or remove systematic noise structure using methods tunable to user and target-specific requirements. Tasks to perform data analysis developed for the author’s science programs are also included. PyKE is an open source project. Contributions of new tasks or enhanced functionality of existing tasks by the community are welcome.

PyKE is a python-based PyRAF package which can also be executed without PyRAF on the command line of a shell.


pyKML is a Python package for creating, parsing, manipulating, and validating KML, a language for encoding and annotating geographic data.

pyKML is based on the lxml.objectify API which provides a Pythonic API for working with XML documents. pyKML adds additional functionality specific to the KML language.

KML comes in several flavors. pyKML can be used with KML documents that follow the base OGC KML specification, the Google Extensions Namespace, or a user-supplied extension to the base KML specification (defined by an XML Schema document).
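
For readers unfamiliar with KML, here is a small sketch that creates and re-parses a one-placemark document. It deliberately uses only the standard library's xml.etree.ElementTree rather than pyKML or lxml, so none of it should be read as the pyKML API; the helper name placemark is made up.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # namespace of the base OGC KML 2.2 spec
ET.register_namespace("", KML_NS)


def placemark(name, lon, lat):
    """Build a minimal KML tree containing a single named point."""
    kml = ET.Element(f"{{{KML_NS}}}kml")
    pm = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML coordinates are written as longitude,latitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return kml


doc = placemark("Example", -122.0, 37.0)
text = ET.tostring(doc, encoding="unicode")

# Parse the serialized text again and read the name back.
parsed = ET.fromstring(text)
name = parsed.find(f"{{{KML_NS}}}Placemark/{{{KML_NS}}}name").text
```

Validation against the KML schema, which the package description mentions, is not attempted here.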


Pynamic is a benchmark designed to test a system’s ability to handle the Dynamic Linking and Loading requirements of Python-based scientific applications. We developed this benchmark to represent a newly emerging class of DLL behaviors. Pynamic builds on pyMPI, an MPI extension to Python. Our augmentation includes a code generator that automatically generates Python C-extension dummy codes and a glue layer that facilitates linking and loading of the generated dynamic modules into the resulting pyMPI. Pynamic is configurable, enabling it to model the static properties of a specific code. It does not, however, model any significant computations of the target and hence it is not subjected to the same level of control as the target code. In fact, we encourage HPC computer vendors and tool developers to add it to their test suites. This benchmark provides an effective test of the compiler, the linker, the loader, the OS kernel and other runtime systems of a high performance computing (HPC) system to handle an important aspect of modern scientific computing applications. In addition, the benchmark serves as a stress test case for code development tools. Although Python has recently gained popularity in the HPC community, its heavy use of DLL operations has hindered certain HPC code development tools, notably parallel debuggers, from performing optimally.

The heart of Pynamic is a Python script that generates C files and compiles them into shared object libraries. Each library contains a Python callable entry function as well as a number of utility functions. The user can also enable cross library function calls with a command line argument. The Pynamic configure script then links these libraries into the pynamic-pyMPI executable and creates a driver script to exercise the functions in the generated libraries. The user can specify the number of libraries to create, as well as the average number of utility functions per library, thus tailoring the benchmark to match some application of interest. Pynamic introduces randomness in the number of functions per module and the function signatures, thus ensuring some heterogeneity of the libraries and functions.


A Python package that allows read and/or write access to a variety of data formats using an interface modeled on netCDF. PyNIO is composed of a C library called libnio along with a Python module based on and with an interface similar to the Scientific.IO.NetCDF module written by Konrad Hinsen. The C library contains the same data I/O code used in NCL, a scripting language developed for analysis and visualization of geo-scientific data.


A library for drawing, labeling, patterning and manipulating particles in 2d images. Pyparty was built on top of scikit-image. It emerged as a means to abstract the concept of particles (i.e. image blobs) into custom data structures for intuitive manipulation and characterization whilst preserving the image API. In addition to integrating new particle constructs with the existing array and image processing functions, pyparty extends scikit-image’s rasterization toolbox with new particle types, patterning, and an interface to Matplotlib patch objects for vectorized particle renderings. Thus, pyparty leverages the conventional imaging pipeline at both ends; it provides a tool set for artificial image composition, and a framework for particle post-processing.

pyparty: Intuitive Particle Processing in Python (online article) -


A Python-based library for automating the management and online publication of scientific software and data. PyRDM aims to automate the process of sharing the software and data via online, citable repositories such as Figshare.

PyRDM provides the capability to integrate research data management into the workflows of scientific software packages, so that the software and data can be curated at the push of a button. The functionality of PyRDM is encapsulated in several Python modules which, in short, facilitate the automated publication of scientific software and data via online, citable repositories. Specifically, the Figshare online repository service is used to store the software and data, and provides DOIs for both resources so they can be appended to any provenance metadata. Not only does this allow a specific version of the software or data to be properly cited in a journal publication, it can also reduce data duplication by enabling the sharing of data with colleagues around the world.

PyRDM: A Python-based library for automating the management and online publication of scientific software and data (online article) -


Various implementations of the language standard known as Python.


An implementation of the Python programming language which is designed to run on the Java(tm) Platform. It consists of a compiler to compile Python source code down to Java bytecodes which can run directly on a JVM, a set of support libraries which are used by the compiled Java bytecodes, and extra support to make it trivial to use Java packages from within Jython.

Jython is an implementation of the Python language for the Java platform. Jython 2.5 implements the same language as CPython 2.5, and nearly all of the Core Python standard library modules. (CPython is the C implementation of the Python language.)


PyPy is a replacement for CPython. It is built using the RPython language that was co-developed with it. The main reason to use it instead of CPython is speed: it generally runs faster. PyPy 2.5 implements Python 2.7.8 and runs on Intel x86 (IA-32), x86_64 and ARM platforms. It supports all of the core language, passing the Python test suite (with minor modifications that were already accepted in the main Python in newer versions). It supports most of the commonly used Python standard library modules.

Our main executable comes with a Just-in-Time compiler. It is really fast in running most benchmarks – including very large and complicated Python applications, not just 10-liners. The case where PyPy works best is when executing long-running programs where a significant fraction of the time is spent executing Python code.


Stackless Python is an enhanced version of the Python programming language. It allows programmers to reap the benefits of thread-based programming without the performance and complexity problems associated with conventional threads. The microthreads that Stackless adds to Python are a cheap and lightweight convenience.

Stackless provides the tools to model concurrency more easily than you can currently do in most conventional languages. With stackless, you get concurrency in addition to all of the advantages of python itself, in an environment that you are (hopefully) already familiar with.
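
The idea of cheap, cooperatively scheduled microthreads can be approximated in standard Python with generators. The sketch below is only an analogy: it does not use the Stackless API, and the names worker and run_all are hypothetical.

```python
from collections import deque


def worker(name, count, log):
    """A microthread-like task: do one small piece of work, then yield control."""
    for i in range(count):
        log.append(f"{name}{i}")
        yield  # hand control back to the scheduler


def run_all(tasks):
    """Round-robin scheduler: resume each task in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue          # task finished, drop it
        queue.append(task)    # task still has work, requeue it


log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
```

The two tasks interleave without any operating-system threads or locks, which is the sort of benefit the description above attributes to microthreads.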


A Python remote procedure call framework that uses JSON RPC v2.0. Python-JRPC allows programmers to create powerful client/server programs with very little code.
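
For orientation, this is roughly what a JSON-RPC 2.0 exchange looks like on the wire. The sketch uses only the standard json module and does not show Python-JRPC's own interface; the METHODS table and the handle function are made up for illustration.

```python
import json

# Hypothetical table of procedures the server exposes.
METHODS = {"add": lambda a, b: a + b}


def handle(raw_request):
    """Decode a JSON-RPC 2.0 request, call the method, and encode the reply."""
    request = json.loads(raw_request)
    method = METHODS.get(request.get("method"))
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    if method is None:
        # -32601 is the code the specification reserves for "Method not found".
        reply["error"] = {"code": -32601, "message": "Method not found"}
    else:
        reply["result"] = method(*request.get("params", []))
    return json.dumps(reply)


request = json.dumps({"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1})
response = json.loads(handle(request))
missing = json.loads(handle(json.dumps({"jsonrpc": "2.0", "method": "nope", "id": 2})))
```

A framework adds the transport (sockets or HTTP), batching and error handling around this core request and response shape.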

the FastHCS algorithm, we carry out an extensive simulation study and four real data applications, the results of which show that FastHCS is systematically more robust to outliers than its competitors.


A Python to C++ compiler for a subset of the Python language. It takes a python module annotated with a few interface description and turns it into a native python module with the same interface, but (hopefully) faster. It is meant to efficiently compile scientific programs, and takes advantage of multi-cores and SIMD instruction units.
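
The interface description mentioned above takes the form of a comment in the module. The following sketch shows what such an annotated module might look like; the function dot is a made-up example, and the file remains ordinary Python that runs unchanged without the compiler.

```python
#pythran export dot(float list, float list)
def dot(xs, ys):
    """Dot product of two equal-length lists of floats."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total
```

The export comment declares the argument types for which a native version should be generated; to the regular interpreter it is just a comment, so the same file can be tested before and after compilation.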


The Pythonic unified complex network and recurrence analysis toolbox is an open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience representing the structure of statistical interrelationships in large data sets of time series and, subsequently, investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis (RQA), recurrence networks, visibility graphs and construction of surrogate time series.



A cross-platform free and open-source desktop geographic information system (GIS) application that provides data viewing, editing, and analysis capabilities. Similar to other GIS software, QGIS allows users to create maps with many layers using different map projections. Maps can be assembled in different formats and for different uses. QGIS allows maps to be composed of raster or vector layers. As is typical for this kind of software, vector data is stored as point, line, or polygon features. Different kinds of raster images are supported, and the software can perform georeferencing of images.

QGIS provides integration with other open source GIS packages, including PostGIS, GRASS, and MapServer, to give users extensive functionality. Plugins, written in Python or C++, extend the capabilities of QGIS. There are plugins to geocode using the Google Geocoding API, perform geoprocessing (fTools) similar to the standard tools found in ArcGIS, and interface with PostgreSQL/PostGIS, SpatiaLite and MySQL databases.


Python bindings for QGIS that depend on SIP and PyQt4.


The single-baroclinic mode Neelin-Zeng Quasi-Equilibrium Tropical Circulation Model (QTCM1) is a primitive equation-based intermediate-level atmospheric model that focuses on simulating the tropical atmosphere. The qtcm package is an implementation of the Neelin-Zeng QTCM1 in a Python object-oriented environment using the f2py Fortran wrapping tool. The result is a modeling package where order and choice of subroutine execution can be altered at runtime, model analysis and visualization can also be integrated with model execution at runtime, while retaining the computationally light footprint of the original intermediate-level model.

R Language

A free software environment for statistical computing and graphics.

Revolution R

An enhanced distribution of R from Revolution Analytics. RRO 3.2.1 is based on version 3.2.1 of the statistical software R and includes additional capabilities for improved performance, reproducibility and platform support.


Exploratory data analysis methods such as principal component methods and clustering.


Perform factorial analysis with a menu and draw graphs interactively thanks to FactoMineR and a Shiny application.


R package implementing multitaper spectral estimation techniques used in time series analysis. This version may be slightly more updated than the one on CRAN.


This package provides a framework to perform Non-negative Matrix Factorization (NMF). It implements a set of already published algorithms and seeding methods, and provides a framework to test, develop and plug new/custom algorithms. Most of the built-in algorithms have been optimized in C++, and the main interface function provides an easy way of performing parallel computations on multicore machines.


Supports the analysis of Oceanographic data, including ADP measurements, CTD measurements, sectional data, sea-level time series, coastline files, etc. Provides functions for calculating seawater properties such as potential temperature and density, as well as derived properties such as buoyancy frequency and dynamic height.


Functions for transforming and viewing 2-D and 3-D (oceanographic) data and model output.


OpenCPU is a system for embedded scientific computing and reproducible research. The OpenCPU server provides a reliable and interoperable HTTP API for data analysis based on R. You can either use the public servers or host your own. The OpenCPU JavaScript client library provides the most seamless integration of R and JavaScript available today. Enjoy simple RPC and data I/O through standard Ajax techniques. No need to learn crazy widgets or obscure frameworks. The OpenCPU API is a clean and simple interface to R, nothing more, nothing less. It is compatible with any language or framework that speaks HTTP.


A suite of functions for converting sp-class objects into KML or KMZ documents for use in Google Earth, enabling visualization of spatial and spatio-temporal objects in Google Earth.


A multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package includes also some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.


This package builds on the EMD package to provide additional tools for empirical mode decomposition (EMD) and Hilbert spectral analysis. It also implements the ensemble empirical mode decomposition (EEMD) and the complete ensemble empirical mode decomposition (CEEMD) methods to avoid mode mixing and intermittency problems found in EMD analysis. The package comes with several plotting methods that can be used to view intrinsic mode functions, the HHT spectrum, and the Fourier spectrum. To see the version history and download the bleeding-edge version (at your own risk!), see the project website. See the other links for PDF files describing numerical and exact analytical methods for determining instantaneous frequency, some examples of signals processed with this package, and some examples of the ensemble empirical mode decomposition method.


An R interface for C library libeemd for performing the ensemble empirical mode decomposition (EEMD), its complete variant (CEEMDAN) or the regular empirical mode decomposition (EMD).


Rserve is a TCP/IP server which allows other programs to use the facilities of R from various languages without the need to initialize R or link against the R library. Every connection has a separate workspace and working directory. Client-side implementations are available for popular languages such as C/C++, PHP and Java. Rserve supports remote connection, authentication and file transfer. A typical use is to integrate an R backend for computation of statistical models, plots, etc. in other applications.


Shiny makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive pre-built widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.


Discrete Prolate Spheroidal Sequence (Slepian) Regression Smoothers.


Package for discrete Morse-Smale complex approximation based on a kNN graph. The Morse-Smale complex provides a decomposition of the domain. This package provides methods to compute a hierarchical sequence of Morse-Smale complexes and tools that exploit this domain decomposition for regression and visualization of scalar functions.


This package provides regularized principal component analysis incorporating smoothness, sparseness and orthogonality of eigenfunctions by using alternating direction method of multipliers (ADMM) algorithm.


This package contains a set of measures of dissimilarity between time series to perform time series clustering. Metrics based on raw data, on generating models and on the forecast behavior are implemented. Some additional utilities related to time series clustering are also provided, such as clustering algorithms and cluster evaluation metrics.
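
As one concrete example of a dissimilarity computed directly on raw series, here is a short sketch of dynamic time warping (DTW) in plain Python. DTW is chosen here only as a widely known measure of this kind; the description above does not say which measures the package implements, and the function name dtw is an assumption.

```python
def dtw(a, b):
    """Dynamic time warping distance with absolute difference as local cost."""
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is matched again
                cost[i][j - 1],      # b advances, a[i-1] is matched again
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[-1][-1]


same_shape = dtw([1, 2, 3], [1, 2, 2, 3])   # second series just lingers on 2
offset = dtw([0, 0], [1, 1])                # constant offset of 1 at both points
```

A matrix of such pairwise values between all series is the usual input to a clustering algorithm.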


The W2CWM2C package is a set of functions to produce new graphical tools for wavelet correlation (bivariate and multivariate cases) using some routines from the waveslim and wavemulcor packages.


Wavelet analysis and reconstruction of time series, cross-wavelets and phase-difference (with filtering options), significance with simulation algorithms.


Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002).
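
To show what a basic 1D wavelet routine computes, the following sketch implements a single level of the Haar transform and its inverse in plain Python. The Haar wavelet is used because it is the simplest case; the function names are hypothetical and unrelated to the package's R interface.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length signal."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * SCALE for a, b in pairs]  # smooth (low-pass) part
    detail = [(a - b) * SCALE for a, b in pairs]  # difference (high-pass) part
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * SCALE, (a - d) * SCALE])
    return signal


series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_step(series)
rebuilt = haar_inverse(approx, detail)
energy_in = sum(v * v for v in series)
energy_out = sum(v * v for v in approx + detail)
```

Applying haar_step again to the approximation coefficients gives the next coarser level, which is how a multi-level decomposition is built up.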


Methods to calculate and interpret climate change signals and time series from climate multi-model ensembles. Climate model output in binary NetCDF format is read in and aggregated over a specified region to a data.frame for statistical analysis. Global circulation models (GCMs), such as the CMIP5 or CMIP3 simulations, can be read in the same way as Regional Climate Models (RCMs), such as the CORDEX or ENSEMBLES simulations.


Simulation and Inference for Stochastic Differential Equations. The YUIMA Project is an open source and collaborative effort aimed at developing the R package yuima for simulation and inference of stochastic differential equations. In the yuima package stochastic differential equations can be of very abstract type, multidimensional, driven by Wiener process or fractional Brownian motion with general Hurst parameter, with or without jumps specified as Lévy noise. The yuima package is intended to offer the basic infrastructure on which complex models and inference procedures can be built on. This paper explains the design of the yuima package and provides some examples of applications.


The Rapid Python Deep Learning Infrastructure (RaPyDLI) project is based on the objective to combine high level Python, C/C++ and Java environments with carefully designed libraries supporting GPU accelerators and MIC coprocessors. Interactive analysis and visualization will be supported together with scaling from the current terabyte size to Petabyte datasets to enable substantial progress in the complexity and capability of the DL applications. A broad range of storage models will be supported including network file systems, databases and HDFS. The partnership of Indiana University, University of Tennessee Knoxville, and Stanford University combines leaders in parallel computing algorithms and run times, Big Data, clouds, and deep learning.


Array Databases allow storing and querying massive multi-dimensional arrays, such as sensor, image, simulation, and statistics data appearing in domains like earth, space, and life science.

The rasdaman ("raster data manager") is the leading array analytics engine distinguished by its flexibility, performance, and scalability. Rasdaman embeds itself smoothly into PostgreSQL, but can also run standalone on file systems. In fact, rasdaman has pioneered Array Databases being the first fully implemented, operationally used system with an array query language and optimized processing engine; known rasdaman databases exceed 230 TB.


The petascope component of rasdaman implements the OGC interface standards WCS 2.0, WCS-T 1.4, WCPS 1.0, WPS 1.0, and WMS 1.1. For this purpose, petascope maintains its additional metadata (such as georeferencing) which is kept in separate relational tables. Note that not all rasdaman raster objects and collections are available through petascope by default; rather, they need to be registered through the petascope administration interface.

Petascope is implemented as a war file of servlets which give access to coverages (in the OGC sense) stored in rasdaman. Internally, incoming requests requiring coverage evaluation are translated into rasql queries by petascope. These queries are passed on to rasdaman, which constitutes the central workhorse. Finally, results returned from rasdaman are forwarded to the client.

Raspberry Pi

Measuring Temperature with RethinkDB and Raspberry Pi -


A fully functioning command-line Dropbox client built using the Dropbox Python API for Raspberry Pi. Features include downloading entire directories as zip files, editing files in a local editor, uploading/downloading, and many many more.


A free operating system based on Debian optimized for the Raspberry Pi hardware. An operating system is the set of basic programs and utilities that make your Raspberry Pi run. However, Raspbian provides more than a pure OS: it comes with over 35,000 packages, pre-compiled software bundled in a nice format for easy installation on your Raspberry Pi.


Clean and fast and geospatial raster I/O for Python programmers who use Numpy. Rasterio employs GDAL under the hood for file I/O and raster formatting. Its functions typically accept and return Numpy ndarrays. Rasterio is designed to make working with geospatial raster data more productive and more fun.

Raster Numpy Basics

IPython notebook tutorial on nbviewer.


A MATLAB implementation of the RBF-QR method for radial basis function interpolation in the small shape parameter range.


Recki-CT is a set of tools that implement a PHP compiler, in PHP. It doesn’t provide a VM, so it can’t run PHP by itself. However, it can parse PHP code and generate other code from it. Recki uses the well-known PHP-Parser library to generate a graph-based representation of the code, and convert it to an intermediate representation. This intermediate form is pretty low-level, and it is comparatively simple to generate code from it for a variety of targets. One of the targets Recki can use is a second component, JitFu, which is a PHP extension allowing us to generate machine code at run time.


JIT-Fu is a PHP extension that exposes an OO API for the creation of native instructions to PHP userland, using libjit.


LibJIT is a library that provides generic Just-In-Time compiler functionality independent of any particular bytecode, language, or runtime. The goal of the libjit project is to provide an extensive set of routines that takes care of the bulk of the JIT process, without tying the programmer down with language specifics. Where we provide support for common object models, we do so strictly in add-on libraries, not as part of the core code.


Redis is an open source, BSD licensed, advanced key-value cache and store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets, sorted sets, bitmaps and hyperloglogs.


A library that implements non-replicated sharding for Redis. It implements a custom routing system on top of the Python Redis client that automatically targets different servers, so you do not have to route requests to the individual nodes manually.
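
The routing idea can be sketched as follows. This is only an illustration of key-based routing, not the library's API: the Router class, the host names, and the choice of a CRC32 hash taken modulo the number of hosts are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


class Router:
    """Toy key router (hypothetical): maps every key to exactly one host."""

    def __init__(self, hosts):
        self.hosts = list(hosts)

    def host_for_key(self, key):
        # A stable hash of the key, reduced modulo the number of hosts.
        # The same key always lands on the same host; nothing is replicated.
        return self.hosts[zlib.crc32(key.encode("utf-8")) % len(self.hosts)]

    def group_by_host(self, keys):
        """Group keys so that each host can be contacted once per batch."""
        groups = defaultdict(list)
        for key in keys:
            groups[self.host_for_key(key)].append(key)
        return dict(groups)


# Hypothetical host names, for illustration only.
router = Router(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
batches = router.group_by_host(["user:1", "user:2", "session:9"])
```

A plain modulo scheme reshuffles most keys when the number of hosts changes, which is one reason real routers often use other strategies.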


RLaB is an interactive, interpreted scientific programming environment for fast numerical prototyping and program development. rlabplus provides the third release of the environment for 32- and 64-bit Linux systems on Intel and ARM/Raspberry Pi architectures. The environment integrates a large number of numerical solvers and functions from various sources, most notably from the GNU Scientific Library (GSL) and from netlib. Within the environment it is possible to visualize data using gnuplot, xmgrace, and either pgplot or plplot; get and post data using uniform resource locators for HDF5 files or the World Wide Web; and control serial, GPIB or TCP/IP connections. RLaB supports embedded Python, Java and ngspice interpreters. RLaB was created by Ian Searle and collaborators. rlabplus is being actively developed by Marijan Kostrun.


A vision of heterogeneous computer systems that incorporate diverse accelerators and automatically select the best computational unit for a particular task is widely shared among researchers and many industry analysts; however, there are no agreed-upon benchmarks to support the research needed in the development of such a platform. There are many suites for parallel computing on general-purpose CPU architectures, but accelerators fall into a gap that is not covered by previous benchmark development. Rodinia is released to address this concern.


The Robot Operating System (ROS) is a flexible framework for writing robot software. It is a collection of tools, libraries, and conventions that aim to simplify the task of creating complex and robust robot behavior across a wide variety of robotic platforms.


This professional scientific software computes recurrence plots, cross recurrence plots, joint recurrence plots and recurrence quantification analysis on the command line of Unix and DOS/DOS-emulated systems. It is able to work with very long data series. However, the output of the results (plots) has to be prepared with external programs (e.g. gnuplot or Matlab).

The state space trajectory can be reconstructed from single time-series by time-delay embedding. Alternatively, the columns of input data can be used as the components of the state space vectors.
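
A minimal sketch of time-delay embedding and of the recurrence matrix built from it is given below. It is not taken from the software described here: the function names, the maximum norm, and the fixed threshold eps are assumptions chosen for illustration.

```python
def delay_embed(series, dim, tau):
    """Build state vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [tuple(series[i + k * tau] for k in range(dim))
            for i in range(n_vectors)]


def recurrence_matrix(vectors, eps):
    """R[i][j] is 1 when states i and j are within eps in the maximum norm."""
    def distance(a, b):
        return max(abs(x - y) for x, y in zip(a, b))
    return [[1 if distance(a, b) <= eps else 0 for b in vectors]
            for a in vectors]


# A short single time series embedded in three dimensions with delay 2.
states = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
plot = recurrence_matrix(states, eps=0.5)
```

When the input already has several columns, each row can be used directly as a state vector and passed to recurrence_matrix without embedding.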


RPerl is an upgrade to the popular Perl 5 programming language. RPerl gives software developers a compiler to make their apps run really fast on parallel computing platforms like multi-core processors, the cloud, clusters, and supercomputers. RPerl stands for Restricted Perl, in that we restrict our use of Perl to those parts which can be made to run fast.

The input to the RPerl compiler is low-magic Perl 5 source code. RPerl converts the low-magic Perl 5 source code into C source code using Perl and/or C data structures. Inline::CPP converts the C source code into XS source code. Perl’s XS tools and a standard C compiler convert the XS source code into machine-readable binary code, which can be directly linked back into normal high-magic Perl 5 source code.

The output of the RPerl compiler is fast-running binary code that is exactly equivalent to, and compatible with, the original low-magic Perl 5 source code input. The net effect is that RPerl compiles slow low-magic Perl 5 code into fast binary code, which can optionally be mixed back into high-magic Perl apps.


A translation and support framework for producing implementations of dynamic languages, emphasizing a clean separation between language specification and implementation aspects. By separating concerns in this way, our implementation of Python - and other dynamic languages - is able to automatically generate a Just-in-Time compiler for any dynamic language. It also allows a mix-and-match approach to implementation decisions, including many that have historically been outside of a user’s control, such as target platform, memory and threading models, garbage collection strategies, and optimizations applied, including whether or not to have a JIT in the first place.


This document describes an implementation in C of a set of randomized algorithms for computing partial Singular Value Decompositions (SVDs). The techniques largely follow the prescriptions in the article "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions," N. Halko, P.G. Martinsson, J. Tropp, SIAM Review, 53(2), 2011, pp. 217-288, but with some modifications to improve performance. The codes implement a number of low rank SVD computing routines for three different sets of hardware: (1) single core CPU, (2) multi core CPU, and (3) massively multicore GPU.


RTL-SDR is a very cheap software defined radio that uses a DVB-T TV tuner dongle based on the RTL2832U chipset. With the combined efforts of Antti Palosaari, Eric Fry and Osmocom it was found that the signal I/Q data could be accessed directly, which allowed the DVB-T TV tuner to be converted into a wideband software defined radio via a new software driver.

Essentially, this means that a cheap $20 TV tuner USB dongle with the RTL2832U chip can be used as a computer based radio scanner. This sort of scanner capability would have cost hundreds or even thousands just a few years ago. The RTL-SDR is also often referred to as RTL2832U, DVB-T SDR, or the “$20 Software Defined Radio”.


A new cross platform SDR receiver which is based on the liquid-dsp libraries.


OsmoSDR is a small form-factor inexpensive SDR (Software Defined Radio) project. If you are familiar with existing SDR receivers, then OsmoSDR can be thought of as something in between a FunCube Dongle (only 96 kHz bandwidth) and a USRP (much more expensive). For a very cheap SDR (with limited dynamic range), you can use the DVB-T USB stick using the RTL2832U chip, as documented in rtl-sdr. OsmoSDR consists of USB-attached hardware, associated firmware, and GrOsmoSDR gnuradio integration on the PC.

The gr-osmosdr software is a GNU Radio block to work with OsmoSDR and rtl-sdr, although it also supports at least a dozen other types of hardware.


A digital signal processing (DSP) library designed specifically for software-defined radios on embedded platforms. The aim is to provide a lightweight DSP library that does not rely on a myriad of external dependencies or proprietary and otherwise cumbersome frameworks. All signal processing elements are designed to be flexible, scalable, and dynamic, including filters, filter design, oscillators, modems, synchronizers, and complex mathematical operations.


Software to turn the RTL2832U into an SDR. Much software is available for the RTL2832. Most of the user-level packages rely on the librtlsdr library which comes as part of the rtl-sdr codebase. This codebase contains both the library itself and also a number of command line tools such as rtl_test, rtl_sdr, rtl_tcp, and rtl_fm. These command line tools use the library to test for the existence of RTL2832 devices and to perform basic data transfer functions to and from the device.

At the user level, there are several options for interacting with the hardware. The rtl-sdr codebase contains a basic FM receiver program that operates from the command line. The rtl_fm program is a command line tool that can initialize the RTL2832, tune to a given frequency, and output the received audio to a file or pipe the output to command line audio players such as the alsa aplay or the sox play commands. There is also the rtl_sdr program that will output the raw I-Q data to a file for more basic analysis.
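
As a sketch of the "more basic analysis" mentioned above, the following Python reads raw I-Q data and converts it to complex samples. It assumes the capture holds interleaved unsigned 8-bit I and Q values, which is the format rtl_sdr is generally documented to write; the scaling constant and the function names are illustrative choices, not part of the rtl-sdr codebase.

```python
def iq_bytes_to_complex(raw):
    """Convert interleaved unsigned 8-bit I/Q bytes to complex samples.

    Each byte is shifted and scaled from 0..255 to the range -1.0..1.0.
    """
    usable = len(raw) - (len(raw) % 2)  # ignore a trailing unpaired byte
    return [complex((raw[i] - 127.5) / 127.5, (raw[i + 1] - 127.5) / 127.5)
            for i in range(0, usable, 2)]


def mean_power(samples):
    """Average of I^2 + Q^2 over all samples (0.0 for an empty capture)."""
    if not samples:
        return 0.0
    return sum(s.real ** 2 + s.imag ** 2 for s in samples) / len(samples)


def load_capture(path):
    """Read a whole capture file; the path is supplied by the caller."""
    with open(path, "rb") as handle:
        return iq_bytes_to_complex(handle.read())


# Two samples taken from four raw bytes.
samples = iq_bytes_to_complex(bytes([255, 0, 127, 128]))
```

For long recordings it would be better to read the file in chunks rather than all at once.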

If you want to do more advanced experiments, the GNU Radio collection of tools can be used to build custom radio devices. GNU Radio can be used both from a GUI perspective in which you can drag-and-drop radio components to build a radio and also programmatically where software programs written in C or Python are created that directly reference the internal GNU Radio functions.

The use of GNU Radio is attractive because of the large number of pre-built functions that can easily be connected together. However, be aware that this is a large body of software with dependencies on many libraries. A simple script performs the installation, but the time required can still be on the order of hours. When starting out, it might be good to try the command line programs that come with the rtl-sdr package first and then install the GNU Radio system later.




Prawn is a nimble PDF writer for Ruby. More important, it’s a hackable platform that offers both high level APIs for the most common needs and low level APIs for bending the document model to accommodate special circumstances.

With Prawn, you can write text, draw lines and shapes and place images anywhere on the page and add as much color as you like. In addition, it brings a fluent API and aggressive code re-use to the printable document space.


Reactive extensions to Python.

The Reactive Extensions for Python (RxPY) is a set of libraries for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators in Python. Using Rx, developers represent asynchronous data streams with Observables, query asynchronous data streams using LINQ operators, and parameterize the concurrency in the asynchronous data streams using Schedulers. Simply put, Rx = Observables + LINQ.
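
The idea of querying a pushed stream of values can be sketched in plain Python. The class below is a toy written for this note and is not the RxPY API: it has no error or completion signals and no Schedulers, and its method names are assumptions.

```python
class Observable:
    """Toy push-based sequence: subscribing runs a producer that emits values."""

    def __init__(self, producer):
        self._producer = producer

    @classmethod
    def from_iterable(cls, items):
        def producer(on_next):
            for item in items:
                on_next(item)
        return cls(producer)

    def subscribe(self, on_next):
        # Nothing happens until someone subscribes.
        self._producer(on_next)

    def map(self, fn):
        # Transform every value before it reaches the downstream observer.
        return Observable(
            lambda on_next: self._producer(lambda value: on_next(fn(value))))

    def filter(self, predicate):
        # Forward only the values for which the predicate holds.
        def producer(on_next):
            def guarded(value):
                if predicate(value):
                    on_next(value)
            self._producer(guarded)
        return Observable(producer)


received = []
(Observable.from_iterable(range(6))
    .filter(lambda value: value % 2 == 0)
    .map(lambda value: value * 10)
    .subscribe(received.append))
```

The query operators only describe the pipeline; values flow through it when subscribe is called.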


SageManifolds is a package under development for the modern computer algebra system Sage, implementing differential geometry and tensor calculus.

SageManifolds deals with real differentiable manifolds of arbitrary dimension. The basic objects are tensor fields and not tensor components in a given vector frame or coordinate chart. In other words, various charts and frames can be introduced on the manifold and a given tensor field can have representations in each of them.

An important class of manifolds treated is that of pseudo-Riemannian manifolds, including Riemannian manifolds and Lorentzian manifolds, with applications to General Relativity. In particular, SageManifolds implements the computation of the Riemann curvature tensor and associated objects (Ricci tensor, Weyl tensor). SageManifolds can also deal with generic affine connections, not necessarily Levi-Civita ones.

The SageManifolds project aims at extending the mathematics software system Sage towards differential geometry and tensor calculus. Like Sage, SageManifolds is free, open-source and is based on the Python programming language. We discuss here some details of the implementation, which relies on Sage’s parent/element framework, and present a concrete example of use.


Sailfish is an open source (LGPL) fluid dynamics solver based on the lattice Boltzmann method (LBM). It uses run-time code generation techniques to automatically generate optimized, simulation-specific code for GPU devices (both CUDA and OpenCL targets are supported).


The Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory is developing algorithms and software technology to enable the application of structured adaptive mesh refinement (SAMR) to large-scale multi-physics problems relevant to U.S. Department of Energy programs.

SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) is an object-oriented C++ software library that enables exploration of numerical, algorithmic, parallel computing, and software issues associated with applying structured adaptive mesh refinement (SAMR) technology in large-scale parallel application development. SAMRAI provides software tools for developing SAMR applications that involve coupled physics models and sophisticated numerical solution methods, and that require high-performance parallel computing hardware. SAMRAI enables integration of SAMR technology into existing codes and simplifies the exploration of SAMR methods in new application domains. Due to judicious application of object-oriented design, SAMRAI capabilities are readily enhanced and extended to meet specific problem requirements. The SAMRAI team collaborates with application researchers at LLNL and other institutions. These interactions motivate the continued evolution of the SAMRAI library.


A complete platform for an Open Source Networked Ground Station. The scope of the project is to create a full stack of open technologies based on open standards, and to construct a full ground station as a showcase of the stack.


An object-functional programming language for general software applications. Scala has full support for functional programming and a very strong static type system. This allows programs written in Scala to be very concise and thus smaller than equivalent programs in many other general-purpose programming languages. Many of Scala’s design decisions were inspired by criticism of the shortcomings of Java.

Scala source code is intended to be compiled to Java bytecode, so that the resulting executable code runs on a Java virtual machine. Java libraries may be used directly in Scala code and vice versa (Language interoperability).[8] Like Java, Scala is object-oriented, and uses a curly-brace syntax reminiscent of the C programming language. Unlike Java, Scala has many features of functional programming languages like Scheme, Standard ML and Haskell, including currying, type inference, immutability, lazy evaluation, and pattern matching. It also has an advanced type system supporting algebraic data types, covariance and contravariance, higher-order types, and anonymous types. Other features of Scala not present in Java include operator overloading, optional parameters, named parameters, raw strings, and no checked exceptions.


Saddle is a data manipulation library for Scala that provides array-backed, indexed, one- and two-dimensional data structures that are judiciously specialized on JVM primitives to avoid the overhead of boxing and unboxing.

Saddle offers vectorized numerical calculations, automatic alignment of data along indices, robustness to missing (N/A) values, and facilities for I/O.

Saddle draws inspiration from several sources, among them the R programming language & statistical environment, the numpy and pandas Python libraries, and the Scala collections library.


A Scala to JavaScript compiler. Scala.js compiles Scala code to JavaScript, allowing you to write your web application entirely in Scala.


The ScalaLab project aims to provide an efficient scientific programming environment for the Java Virtual Machine. The scripting language is based on the Scala programming language enhanced with high level scientific operators and with an integrated environment that provides a MATLAB-like working style. In addition, the huge libraries of Java scientific code are easily accessible (often with a more convenient syntax). The main strengths of ScalaLab are numerical code speed and flexibility. The statically typed Scala language can give scripting code speeds similar to pure Java. A major design priority of ScalaLab is its user-friendly interface. We want users to enjoy writing scientific code, and we design the whole framework with this objective.


A suite of machine learning and numerical computing libraries. ScalaNLP is the umbrella project for several libraries, including Breeze and Epic. Breeze is a set of libraries for machine learning and numerical computing. Epic is a high-performance statistical parser and structured prediction library.


ScMathML is a Scala library for executing Content MathML. Content MathML is a move towards a standard, open format for representing mathematics with relatively well defined semantics. ScMathML takes formulas, and evaluates them in a Context, which provides access to domain objects, constants etc.
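
To make "evaluating a formula in a Context" concrete, here is a small Python sketch that evaluates a fragment of Content MathML against a dictionary of variable values. It is not ScMathML's API and covers only a handful of operator elements; the plain dictionary standing in for the Context and the function names are assumptions.

```python
import operator
import xml.etree.ElementTree as ET
from functools import reduce

# Operator elements handled by this sketch; Content MathML defines many more.
OPERATORS = {
    "plus": lambda args: sum(args),
    "times": lambda args: reduce(operator.mul, args, 1.0),
    "minus": lambda args: -args[0] if len(args) == 1 else args[0] - args[1],
    "divide": lambda args: args[0] / args[1],
    "power": lambda args: args[0] ** args[1],
}


def local_name(element):
    """Strip an XML namespace prefix such as '{http://...}apply'."""
    return element.tag.rsplit("}", 1)[-1]


def evaluate(element, context):
    """Recursively evaluate an element; context maps identifiers to values."""
    name = local_name(element)
    if name == "math":
        return evaluate(element[0], context)
    if name == "cn":  # numeric literal
        return float(element.text.strip())
    if name == "ci":  # identifier looked up in the context
        return context[element.text.strip()]
    if name == "apply":  # first child is the operator, the rest are operands
        operator_name = local_name(element[0])
        args = [evaluate(child, context) for child in element[1:]]
        return OPERATORS[operator_name](args)
    raise ValueError("unsupported element: " + name)


FORMULA = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><plus/>
    <apply><times/><cn>2</cn><ci>x</ci></apply>
    <cn>1</cn>
  </apply>
</math>
"""

# Evaluates 2*x + 1 with x bound to 3.0 in the context.
result = evaluate(ET.fromstring(FORMULA), {"x": 3.0})
```

Changing the dictionary re-evaluates the same formula for different domain values without touching the markup.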


A Scala project that harvests sensor data from web sources. The data is then pushed to an SOS using the sos-injection module project. SosInjector is a project that wraps a Sensor Observation Service (SOS). The sos-injection module provides Java classes to enter stations, sensors, and observations into an SOS.

sensor-web-harvester is used to fill an SOS with observations from many well-known sensor sources (such as NOAA and NERRS). This project pulls sensor observation values from the source’s stations. It then formats the data to be placed into the user’s SOS by using the sos-injector. The source stations used are filtered by a chosen bounding box area.
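
The bounding box filter can be pictured with a short Python sketch. The Station and BoundingBox types and the station identifiers are hypothetical and do not come from sensor-web-harvester; the check also assumes a box that does not cross the antimeridian.

```python
from typing import NamedTuple


class Station(NamedTuple):
    station_id: str  # hypothetical identifier
    lat: float
    lon: float


class BoundingBox(NamedTuple):
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        # Simple inclusive test; assumes west <= east (no antimeridian wrap).
        return (self.south <= lat <= self.north
                and self.west <= lon <= self.east)


def filter_stations(stations, box):
    """Keep only the stations located inside the chosen bounding box."""
    return [s for s in stations if box.contains(s.lat, s.lon)]


stations = [
    Station("example-1", 36.8, -76.3),
    Station("example-2", 47.6, -122.3),
    Station("example-3", 38.9, -77.0),
]
mid_atlantic = BoundingBox(south=35.0, west=-80.0, north=40.0, east=-74.0)
selected = filter_stations(stations, mid_atlantic)
```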


The Scalable LAPACK library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication. It assumes matrices are laid out in a two-dimensional block cyclic decomposition. ScaLAPACK is designed for heterogeneous computing and is portable on any computer that supports MPI or PVM.
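
The two-dimensional block cyclic layout can be illustrated with a few lines of Python. This sketch uses 0-based indices and assumes the first block is stored on process row 0 and process column 0; the function names are made up for this note and are not ScaLAPACK routines.

```python
def owner_and_local_index(global_index, block_size, n_procs):
    """Map a global index along one dimension to (process, local index)."""
    block = global_index // block_size      # which block the index falls in
    proc = block % n_procs                  # blocks are dealt out cyclically
    local_block = block // n_procs          # blocks this process holds before it
    return proc, local_block * block_size + global_index % block_size


def locate(i, j, row_block, col_block, proc_rows, proc_cols):
    """Return ((process row, process column), (local row, local column))."""
    prow, local_i = owner_and_local_index(i, row_block, proc_rows)
    pcol, local_j = owner_and_local_index(j, col_block, proc_cols)
    return (prow, pcol), (local_i, local_j)


# Matrix entry (5, 2) with 2x2 blocks on a 2x2 process grid.
where = locate(5, 2, row_block=2, col_block=2, proc_rows=2, proc_cols=2)
```

Because consecutive blocks go to different processes, each process ends up with pieces from all over the matrix, which helps balance the work in factorizations that shrink the active part of the matrix as they proceed.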


An environment for scientific computation, data analysis and data visualization designed for scientists, engineers and students. The program incorporates many open-source software packages into a coherent interface using the concept of dynamic scripting.

SCaVis can be used with several scripting languages for the Java platform, such as BeanShell, Jython (the Python programming language), Groovy and JRuby (Ruby programming language). This brings more power and simplicity for scientific computation. The programming can also be done in native Java. Finally, symbolic calculations can be done using the Matlab/Octave high-level interpreted language.


Schur is a stand alone C program for interactively calculating properties of Lie groups and symmetric functions. Schur has been designed to answer questions of relevance to a wide range of problems of special interest to chemists, mathematicians and physicists - particularly for persons who need specific knowledge relating to some aspect of Lie groups or symmetric functions and yet do not wish to be encumbered with complex algorithms. The objective of Schur is to supply results with the complexity of the algorithms hidden from view so that the user can effectively use Schur as a scratch pad, obtaining a result and then using that result to derive new results in a fully interactive manner. Schur can be used as a tool for calculating branching rules, Kronecker products, Casimir invariants, dimensions, plethysms, S-function operations, Young diagrams and their hook lengths etc.
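
As an illustration of one of the quantities listed above, the following Python computes the hook lengths of a Young diagram and, through the hook length formula, the dimension of the corresponding irreducible representation of the symmetric group. It is an independent sketch and says nothing about how Schur itself computes these values.

```python
from math import factorial


def hook_lengths(partition):
    """Hook length of every cell of the Young diagram of a partition.

    The partition is a weakly decreasing list of positive row lengths.
    """
    if not partition:
        return []
    # Column lengths, i.e. the conjugate partition.
    columns = [sum(1 for row in partition if row > j)
               for j in range(partition[0])]
    # Hook = cells to the right (arm) + cells below (leg) + the cell itself.
    return [[(row_len - j - 1) + (columns[j] - i - 1) + 1
             for j in range(row_len)]
            for i, row_len in enumerate(partition)]


def dimension(partition):
    """Hook length formula: n! divided by the product of all hook lengths."""
    product = 1
    for row in hook_lengths(partition):
        for hook in row:
            product *= hook
    return factorial(sum(partition)) // product


hooks = hook_lengths([3, 2])
dim = dimension([3, 2])
```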

As well as being a research tool, Schur forms an excellent tool for helping students to independently explore the properties of Lie groups and symmetric functions and to test their understanding by creating simple examples and moving on to more complex examples. The user has at his or her disposal over 160 commands which may be nested to give a vast variety of potential operations. Every command, with examples, is described in a 200 page manual. Attention has been given to input/output issues to simplify input and to give a well organized output. The output may be obtained in TeX form if desired. Log files may be created for subsequent editing. Online help files may be brought to screen at any time.


ScientiFig is a free tool to help you create, format or reformat scientific figures.


The SciRuby Project aims to provide Ruby with scientific capabilities similar to what the wonderful NumPy and SciPy libraries bring to Python. Our goal is to provide a complete suite of statistical, numerical, and visualization software tools for scientific computing.


SCIRun is a problem solving environment or "computational workbench" in which a user selects software modules that can be connected in a visual programming environment to create a high level workflow for experimentation. Each module exposes all the available parameters necessary for scientists to adjust the outcome of their simulation or visualization. The networks in SCIRun are flexible enough to enable duplication of networks and creation of new modules.

Many SCIRun users find this software particularly useful for their bioelectric field research. Their topics of investigation include cardiac electro-mechanical simulation, ECG and EEG forward and inverse calculations, modeling of deep brain stimulation, electromyography calculation, and determination of the electrical conductivity of anisotropic heart tissue. Users have also made use of SCIRun for the visualization of breast tumor brachytherapy, computer aided surgery, teaching, and a number of non-biomedical applications.

SciTools Github


A Python WMS service for geospatial gridded data (only triangular unstructured meshes and logically rectangular grids are officially supported at this time).


My First 5 Minutes On A Server; Or, Essential Security for Linux Servers


EncFS provides an encrypted filesystem in user-space. It runs with regular user permissions using the FUSE library.


Fail2ban scans log files (e.g. /var/log/apache/error_log) and bans IPs that show malicious signs, such as too many password failures or probing for exploits. Generally Fail2Ban is then used to update firewall rules to reject the IP addresses for a specified amount of time, although any other arbitrary action (e.g. sending an email) could also be configured. Out of the box Fail2Ban comes with filters for various services (apache, courier, ssh, etc).

Fail2Ban is able to reduce the rate of incorrect authentication attempts; however, it cannot eliminate the risk that weak authentication presents. Configure services to use only two-factor or public/private key authentication mechanisms if you really want to protect services.
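
The scan-and-ban idea can be sketched in a few lines of Python. This is an illustration only, not Fail2Ban's implementation or configuration: the log lines, the regular expression, and the max_retry threshold are assumptions, and a real setup would also consider a time window and apply the ban through a firewall rule.

```python
import re
from collections import defaultdict

# Assumed log format for failed SSH logins; real filters are more thorough.
FAILURE_PATTERN = re.compile(
    r"Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3})")


def find_offenders(log_lines, max_retry=3):
    """Return the sorted IPs with at least max_retry failed attempts."""
    failures = defaultdict(int)
    for line in log_lines:
        match = FAILURE_PATTERN.search(line)
        if match:
            failures[match.group(1)] += 1
    return sorted(ip for ip, count in failures.items() if count >= max_retry)


# Made-up log lines using documentation-only IP addresses.
LOG = [
    "sshd[101]: Failed password for root from 203.0.113.5 port 4711 ssh2",
    "sshd[102]: Failed password for invalid user admin from 203.0.113.5 port 4712 ssh2",
    "sshd[103]: Accepted publickey for alice from 198.51.100.7 port 5000 ssh2",
    "sshd[104]: Failed password for root from 203.0.113.5 port 4713 ssh2",
    "sshd[105]: Failed password for bob from 198.51.100.9 port 5100 ssh2",
]

to_ban = find_offenders(LOG, max_retry=3)
```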


A minimalist Delicious clone you can install on your own website. It is designed to be personal (single-user), fast and handy. It is used to store and share web links.



A command line shell for process supervision suites of the daemontools family. Currently, it supports daemontools, perp, s6 and runit. It provides a unified interface allowing easy inspection and manipulation of services (i.e. processes) managed by these supervisors.


In high-end computing environments, remote file transfers of very large data sets to and from computational resources are commonplace as users are typically widely distributed across different organizations and must transfer in data to be processed and transfer out results for further analysis. Local transfers of this same data across file systems are also frequently performed by administrators to optimize resource utilization when new file systems come on-line or storage becomes imbalanced between existing file systems. In both cases, files must traverse many components on their journey from source to destination where there are numerous opportunities for performance optimization as well as failure. A number of tools exist for providing reliable and/or high performance file transfer capabilities, but most either do not support local transfers, require specific security models and/or transport applications, are difficult for individual users to deploy, and/or are not fully optimized for highest performance.

Shift is a framework for Self-Healing Independent File Transfer that provides high performance and resilience for local and remote transfers through a variety of techniques. These include end-to-end integrity via cryptographic hashes, throttling of transfers to prevent resource exhaustion, balancing transfers across resources based on load and availability, and parallelization of transfers across multiple source and destination hosts for increased redundancy and performance. In addition, Shift was specifically designed to accommodate the diverse heterogeneous environments of a widespread user base with minimal assumptions about operating environments. In particular, Shift is unique in its ability to provide advanced reliability and automatic single and multi-file parallelization to any stock command-line transfer application while being easily deployed by both individual users as well as entire organizations.
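
One of the techniques listed above, end-to-end integrity via cryptographic hashes, can be sketched as follows. The code is a generic illustration and not part of Shift: the choice of SHA-256, the function names, and the single retry-free copy are assumptions.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source, destination):
    """Copy a file and confirm that both ends hash to the same value."""
    expected = file_digest(source)
    shutil.copyfile(source, destination)
    actual = file_digest(destination)
    if actual != expected:
        raise IOError("checksum mismatch after copying " + source)
    return actual


def demo():
    """Copy a small temporary file and return (digest, expected digest)."""
    payload = b"example payload" * 1000
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source.bin")
        destination = os.path.join(workdir, "destination.bin")
        with open(source, "wb") as handle:
            handle.write(payload)
        return verified_copy(source, destination), hashlib.sha256(payload).hexdigest()
```

A self-healing transfer tool would typically react to a mismatch by re-sending the affected data rather than just raising an error.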


The Scalable HeterOgeneous Computing (SHOC) benchmark suite is a collection of benchmark programs testing the performance and stability of systems using computing devices with non-traditional architectures for general purpose computing. Its initial focus is on systems containing Graphics Processing Units (GPUs) and multi-core processors, and on the OpenCL programming standard. It can be used on clusters as well as individual hosts.


The Super Instruction Architecture (SIA) is an environment comprising a programming language, SIAL, and a runtime system, SIP, with the goal of providing portable and efficient code related to tensor computations for a wide array of computing environments including distributed-memory environments [20]. SIAL exposes commonly used abstractions in scientific computing, such as blocking, providing the user a useful method of describing how an algorithm proceeds without unnecessarily complicating the code. Programs written in SIAL are compiled to a bytecode which is then interpreted by an SIP virtual machine which handles the execution of the program. Additionally, the SIP handles difficulties associated with parallelism, thus hiding this aspect of the program from the user. Distributed-memory parallelism in the SIP is handled through the use of asynchronous communication routines to aid in effectively overlapping computation with communication.


This project is a SimTK toolset providing general multibody dynamics capability, that is, the ability to solve Newton’s 2nd law F=ma in any set of generalized coordinates subject to arbitrary constraints. Simbody is provided as an open source, object-oriented C++ API and delivers high-performance, accuracy-controlled science/engineering-quality results.

Simbody uses an advanced Featherstone-style formulation of rigid body mechanics to provide results in Order(n) time for any set of n generalized coordinates. This can be used for internal coordinate modeling of molecules, or for coarse-grained models based on larger chunks. It is also useful for large-scale mechanical models, such as neuromuscular models of human gait, robotics, avatars, and animation. Simbody can also be used in real time interactive applications for biosimulation as well as for virtual worlds and games.

This toolset was developed originally by Michael Sherman at the Simbios Center at Stanford, with major contributions from Peter Eastman and others. Simbody descends directly from the public domain NIH Internal Variable Dynamics Module (IVM) facility for molecular dynamics developed and kindly provided by Charles Schwieters. IVM is in turn based on the spatial operator algebra of Rodriguez and Jain from NASA’s Jet Propulsion Laboratory (JPL), and Simbody has adopted that formulation.

See also PyCraft


SIMD.js is a new API being developed by Intel, Google, and Mozilla for JavaScript which introduces several new types and functions for doing SIMD computations. For example, the Float32x4 type represents 4 float32 values packed up together. The API contains functions to operate on those values together, including all the basic arithmetic operations, and operations to rearrange, load, and store such values. The intent is for browsers to implement this API directly, and provide optimized implementations that make use of SIMD instructions in the underlying hardware.
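
The lane-wise model behind a type such as Float32x4 can be mimicked in Python. The class below only imitates the semantics (four float32 lanes operated on together) and gains no speed; its method names are illustrative and are not the SIMD.js function names.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


class Float32x4:
    """Four float32 values packed together and processed lane by lane."""

    def __init__(self, x, y, z, w):
        self.lanes = tuple(to_float32(v) for v in (x, y, z, w))

    def add(self, other):
        # One conceptual operation adds all four lanes at once.
        return Float32x4(*(a + b for a, b in zip(self.lanes, other.lanes)))

    def mul(self, other):
        return Float32x4(*(a * b for a, b in zip(self.lanes, other.lanes)))

    def swizzle(self, i0, i1, i2, i3):
        # Rearrange lanes: output lane k takes the input lane at index ik.
        return Float32x4(*(self.lanes[i] for i in (i0, i1, i2, i3)))


a = Float32x4(1.0, 2.0, 3.0, 4.0)
b = Float32x4(10.0, 20.0, 30.0, 40.0)
total = a.add(b)
reversed_lanes = a.swizzle(3, 2, 1, 0)
```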

The SIMD.js API itself is in active development. The ecmascript_simd github repository is currently serving as a provisional specification as well as providing a polyfill implementation to provide the functionality, though of course not the accelerated performance, of the SIMD API on existing browsers. It also includes some benchmarks which also serve as examples of basic SIMD.js usage.


Once you have generated a discrete problem, you will want to translate this abstract formulation into specific code for a particular simulation platform. Simflowny is designed as an extensible framework on which plug-ins for different simulation platforms can be easily added. The current version provides support for the Cactus simulation framework and for the SAMRAI mesh management system. Both Cactus and SAMRAI provide parallelization by leveraging MPI-based communication between computers, which permits running simulations on clusters and taking advantage of multiple cores in modern chips.

Simflowny generates Fortran code for Cactus and C++ code for SAMRAI. It is also capable of compiling and linking a final binary that can be used independently as simulation software. Alternatively, Simflowny also provides a GUI to manage simulations within the platform. Simulations may be launched locally, or remotely, by connecting to a Grid infrastructure.

Output both in Cactus and SAMRAI is mainly generated through HDF5 files, which contain snapshots from certain instants in the simulation. These results may be visualized with a number of commercial and free visualization tools.


SimGrid is a scientific instrument to study the behavior of large-scale distributed systems such as Grids, Clouds, HPC or P2P systems. It can be used to evaluate heuristics, prototype applications or even assess legacy MPI applications.


This project provides tools for postprocessing data on triangular grids (simplex cells), such as computing meridional and barotropic stream functions and several transports through user defined slices. The data are interpolated onto a regular grid of user defined mesh size, equidistant in each (horizontal) coordinate direction. Postprocessing takes place on this regular grid.
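
The step from a triangular grid to a regular grid can be sketched for a single triangle. The source does not say which interpolation scheme the tools use, so the linear (barycentric) interpolation below is an assumption, as are the function names.

```python
def barycentric_weights(p, a, b, c):
    """Weights (wa, wb, wc) with p = wa*a + wb*b + wc*c and wa + wb + wc = 1."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb


def interpolate(p, triangle, values, tolerance=1e-12):
    """Linear interpolation inside one triangle; None if p lies outside."""
    weights = barycentric_weights(p, *triangle)
    if min(weights) < -tolerance:
        return None
    return sum(w * v for w, v in zip(weights, values))


def sample_on_regular_grid(triangle, values, x0, y0, spacing, nx, ny):
    """Sample the triangle on an equidistant grid of nx by ny points."""
    return [[interpolate((x0 + i * spacing, y0 + j * spacing), triangle, values)
             for i in range(nx)]
            for j in range(ny)]


triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
node_values = (0.0, 4.0, 8.0)
grid = sample_on_regular_grid(triangle, node_values, 0.0, 0.0, 0.5, 3, 3)
```

For a full mesh, each regular grid point would first have to be matched to the triangle that contains it; grid points covered by no triangle stay empty (None here).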


A web-based scientific application deployment and visualization framework for coastal modeling and beyond.