Notes about how to install and use cool software.
Machine Learning and Data Mining Info
Discovering and Visualizing Patterns with Python
PDF cheatsheet (7 pp.)
Weblog of the cheatsheet author.
Machine Learning: An Algorithmic Perspective
A book with lots of Python examples, the code for which is available at the link shown.
Neural Network Emulations for Complex Multidimensional Geophysical Mappings
PDF review paper (34 pp.)
Predicting Solar Energy from Weather Forecasts Using Python
Using Python to read data from NetCDF files and then perform data mining.
Application of Machine Learning Methods to Spatial Interpolation of Environmental Variables
PDF paper (13 pp.)
Review of Spatial Interpolation Methods for Environmental Scientists
PDF technical report (154 pp.)
PDF review paper (46 pp.)
Comparing Predictive Power in Climate Data: Clustering Matters
PDF paper (17 pp.)
Applying Machine Learning Methods to Climate Variability
Nonlinear Multivariate and Time Series Analysis by Neural Network Methods
Pattern Recognition in Time Series
PDF paper (28 pp.)
Application of Statistical Learning to Plankton Image Analysis
Machine Learning Algorithms for Real Data Sources with Applications to Climate Science
PDF slides (46 pp.)
Machine Learning for Climate Science
Online slides (196 pp.)
Applicability of Data Mining Techniques for Climate Prediction
PDF paper (4 pp.)
Outstanding Problems at the Interface of Climate Prediction and Data Mining
Online slides (35 pp.)
Unsupervised Machine Learning Techniques for Studying Climate Variability
PDF slides (21 pp.)
Tracking Climate Models
PDF paper (15 pp.)
Streaming Data Mining
PDF slides (229 pp.)
Machine Learning for Hackers
Book (324 pp.) with examples using R.
Python and Matlab
A software framework in Fortran to build large-scale parallel applications. It is designed for applications using three-dimensional structured mesh and spatially implicit numerical algorithms. At the foundation it implements a general-purpose 2D pencil decomposition for data distribution on distributed-memory platforms. On top it provides a highly scalable and efficient interface to perform three-dimensional distributed FFTs. The library is optimised for supercomputers and scales well to hundreds of thousands of cores. It relies on MPI but provides a user-friendly programming interface that hides communication details from application developers.
A programming environment for heterogeneous architectures.
A set of DOE-developed software tools, sometimes in collaboration with other funding agencies (DARPA, NSF), that make it easier for programmers to write high performance scientific applications for high-end computers.
The Advanced Data mining And Machine learning System (ADAMS) is a novel, flexible workflow engine aimed at quickly building and maintaining real-world, complex knowledge workflows.
A system of computer programs for solving time dependent, free surface circulation and transport problems in two and three dimensions. These programs utilize the finite element method in space allowing the use of highly flexible, unstructured grids.
The Adaptable IO System (ADIOS) provides a simple, flexible way for scientists to describe the data in their code that may need to be written, read, or processed outside of the running simulation. By providing an XML file, external to the code, that describes the various data elements, their types, and how they should be processed for a given run, the I/O routines in the host code (either Fortran or C) can transparently change how they process the data.
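The external XML descriptor might look roughly like this. This is only a sketch in the style of the ADIOS 1.x format; the element and attribute names (adios-group, method, buffer, size-MB) are recalled from the ADIOS 1.x manual and may differ between versions:

```xml
<adios-config host-language="Fortran">
  <!-- a named group of variables the simulation will write -->
  <adios-group name="checkpoint">
    <var name="NX" type="integer"/>
    <var name="temperature" type="double" dimensions="NX"/>
  </adios-group>
  <!-- the transport method is chosen per run, without touching the code -->
  <method group="checkpoint" method="MPI"/>
  <buffer size-MB="50" allocate-time="now"/>
</adios-config>
```

Swapping the method entry (for example to a POSIX or staging transport) changes how the data is written without recompiling the host code.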
A software library designed to help rapidly build scalable parallel programs.
An adaptive mesh refinement package written in Fortran 90.
An open-source, object-oriented finite element library with the ambition to be generic and efficient. Akantu is developed within the LSMS (Computational Solid Mechanics Laboratory, lsms.epfl.ch), where research is conducted at the interface of mechanics, materials science, and scientific computing. The open-source philosophy matters for the evolution of any scientific software project: shared code keeps an implementation honest, because users (and not only developers) can criticize its details. Akantu was born with the vision of combining genericity, robustness, and efficiency while benefiting from open-source visibility.
A software package providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on the Markov logic representation.
The Adaptive Mesh generator for Atmospheric and Ocean Simulation is a mesh generator for adaptive algorithms. It is capable of handling complex geometries as well as highly non-uniform refinement regions. It has a relatively simple programming interface and incorporates some optimization. There is even a 3D version of amatos.
The Adaptive Message Passing Interface is an implementation of MPI that supports dynamic load balancing and multithreading for MPI applications.
A machine independent parallel programming system. Programs written using this system will run unchanged on MIMD machines with or without a shared memory. It provides high-level mechanisms and strategies to facilitate the task of developing even highly complex parallel applications.
The Astrophysical Multipurpose Software Environment provides a software framework for astrophysical simulations, in which existing codes from different domains, such as stellar dynamics, stellar evolution, hydrodynamics, and radiative transfer, can be easily coupled. AMUSE uses Python to interface with existing numerical codes. The AMUSE interface handles unit conversions, provides consistent object-oriented interfaces, manages the state of the underlying simulation codes, and provides transparent distributed computing.
A collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.
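To give a flavor of what such iterative eigensolvers do, here is plain power iteration in pure Python. ARPACK's Arnoldi/Lanczos methods are far more capable, but they share the same access pattern: the solver only ever needs a matrix-vector product. This stand-in is not ARPACK's API.

```python
def power_iteration(matvec, n, iters=200):
    """Estimate the dominant eigenvalue of an n x n operator given only
    a matrix-vector product, the same access pattern ARPACK requires."""
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = matvec(v)
        lam = max(abs(x) for x in w)          # infinity-norm eigenvalue estimate
        v = [x / lam for x in w]              # renormalize the iterate
    return lam, v

# 2x2 example: diag(2, 1) has dominant eigenvalue 2
A = [[2.0, 0.0], [0.0, 1.0]]
matvec = lambda v: [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
lam, v = power_iteration(matvec, 2)
```

Large sparse problems are where this access pattern pays off: the operator never needs to be stored as a dense matrix.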
A parallel server for adaptive geoinformation.
A free open-source software program for solving small to very large mathematical models. ASCEND can solve systems of non-linear equations, linear and nonlinear optimisation problems, and dynamic systems expressed in the form of differential/algebraic equations.
A SEJITS implementation for Python. Asp is a research prototype and implementation of SEJITS (Selective, Embedded Just-in-Time Specialization) for Python. With the aid of application-specific specializers, it compiles fragments of Python down to low-level parallelized CPU and GPU implementations.
A Python web framework that makes the most of the filesystem. Simplates are the main attraction.
AstroML is a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, and matplotlib, and distributed under the 3-clause BSD license. It contains a growing library of statistical and machine learning routines for analyzing astronomical data in Python, loaders for several open astronomical datasets, and a large suite of examples of analyzing and visualizing astronomical datasets.
The goal of astroML is to provide a community repository for fast Python implementations of common tools and routines used for statistical data analysis in astronomy and astrophysics, and to provide a uniform and easy-to-use interface to freely available astronomical datasets. We hope this package will be useful to researchers and students of astronomy.
A high-performance language interoperability tool.
A set of tools that parses C++ and Fortran 90 source files and automatically generates bridging code to provide for seamless language interoperability.
A framework for analyzing source code written in several programming languages and for making rich program knowledge accessible to developers of static and dynamic analysis tools. PDT implements a standard program representation, the program database (PDB), that can be accessed in a uniform way through a class library supporting common PDB operations.
A tool to create an XML representation of a GNU Fortran parse tree.
An open-source toolbox and development platform for viewing, analysing and processing of remote sensing raster data. Originally developed to facilitate the utilisation of image data from Envisat’s optical instruments, BEAM now supports a growing number of other raster data formats such as GeoTIFF and NetCDF as well as data formats of other EO sensors such as MODIS, AVHRR, AVNIR, PRISM and CHRIS/Proba. Various data and algorithms are supported by dedicated extension plug-ins.
A full implementation of SRM v2.2, developed by Lawrence Berkeley National Laboratory, for disk-based storage systems and mass storage systems such as HPSS. End users may have their own personal BeStMan that manages and provides an SRM interface to their local disks or storage systems. It works on top of existing disk-based unix file systems, and has been reported so far to work on file systems such as NFS, PVFS, AFS, GFS, GPFS, PNFS, and Lustre. It also works with any existing file transfer service, such as gsiftp, http, https and ftp. It requires minimal administrative effort for deployment and maintenance.
BeStMan2 is a Jetty-based implementation of SRM v2.2, as opposed to the Globus-container-based implementation in the previous BeStMan. All other functionality and features are the same.
A C++ template library for the discretisation of boundary integral operators as they arise in various physical and engineering applications. Prominent examples are, e.g., electrostatic or thermal models as well as the scattering of acoustic and electromagnetic waves. While BETL currently implements the discretisation of 3-dimensional boundary integral operators via Galerkin schemes its design principles allow also for the incorporation of other discretisation schemes such as, e.g., the still popular collocation methods.
The Bespoke Framework Generator (BFG) is a prototype implementation of the Flexible Coupling Approach (FCA). The BFG specifies single model rules to which a conformant model implementation must adhere; it also defines XML schemas to capture metadata describing the conformant models, their scientific composition and their deployment onto resources. The BFG engine (written in xsl) then processes the resultant (user specified) XML, producing appropriate "wrapper code" within which the models can execute.
Sync is unlimited, secure file-syncing. You can use it for remote backup. Or, you can use it to transfer large folders of personal media between users and machines; editors and collaborators. It’s simple. It’s free. It’s the awesome power of P2P, applied to file-syncing.
A comprehensive black-hole event generator, which simulates the experimental signature of microscopic and Planckian black-hole production and evolution at the LHC in the context of brane-world models with low-scale quantum gravity. The generator is based on phenomenologically realistic models free of serious problems that plague low-scale gravity, thus offering more realistic predictions for hadron-hadron colliders. The generator includes all of the black-hole gray-body factors known to date and incorporates the effects of black-hole rotation, splitting between the fermions, non-zero brane tension and black-hole recoil due to Hawking radiation (although not all simultaneously).
The next generation of NumPy.
BlobSeer is a large-scale distributed storage service that addresses advanced data management requirements resulting from ever-increasing data sizes. It is centered around the idea of leveraging versioning for concurrent manipulation of binary large objects in order to efficiently exploit data-level parallelism and sustain a high throughput despite massively parallel data access.
A blocking, shuffling and lossless compression library. Blosc is a high-performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() call. Blosc is the first compressor (to our knowledge) meant not only to reduce the size of large datasets on disk or in memory, but also to accelerate memory-bound computations (typical in vector-vector operations).
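The compress/decompress round trip at the heart of this idea can be sketched with the stdlib zlib codec as a stand-in, so the example runs anywhere. With the python-blosc binding installed, `blosc.compress`/`blosc.decompress` would be the analogous calls (an assumption about the binding's names, not verified here):

```python
import zlib

# 1 MiB of highly regular binary data, the kind Blosc targets
data = bytes(range(256)) * 4096

# level 1: a fast, low-ratio setting, in the spirit of Blosc's speed-first design
packed = zlib.compress(data, 1)

assert zlib.decompress(packed) == data   # lossless round trip
ratio = len(data) / len(packed)          # regular data compresses heavily
```

The point of Blosc is that for data this regular, decompressing from cache can be faster than fetching the uncompressed bytes from main memory.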
A free signal-processing and machine learning toolbox. The toolbox is written in a mix of Python and C++ and is designed both to be efficient and to reduce development time. Its capabilities include mathematical and signal processing, image processing, machine learning, data storage and management, and database support.
Interactive web plotting with Python. Bokeh is a Python interactive visualization library for large datasets that natively uses the latest web technologies. Its goal is to provide elegant, concise construction of novel graphics in the style of Protovis/D3, while delivering high-performance interactivity over large data to thin clients.
A Python interface to Amazon Web Services.
A collection of fast NumPy array functions written in Cython.
A complete network graphing solution designed to harness the power of RRDTool’s data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices.
The OpenSource industry standard, high performance data logging and graphing system for time series data. RRDtool can be easily integrated in shell scripts, perl, python, ruby, lua or tcl applications.
Cactus is an open source problem solving environment designed for scientists and engineers. Its modular structure easily enables parallel computation across different architectures and collaborative code development between different groups. Cactus originated in the academic research community, where it was developed and used over many years by a large international collaboration of physicists and computational scientists.
The name Cactus comes from the design of a central core ("flesh") which connects to application modules ("thorns") through an extensible interface. Thorns can implement custom developed scientific or engineering applications, such as computational fluid dynamics. Other thorns from a standard computational toolkit provide a range of computational capabilities, such as parallel I/O, data distribution, or checkpointing.
Cactus runs on many architectures. Applications, developed on standard workstations or laptops, can be seamlessly run on clusters or supercomputers. Cactus provides easy access to many cutting edge software technologies being developed in the academic research community, including the Globus Metacomputing Toolkit, HDF5 parallel file I/O, the PETSc scientific library, adaptive mesh refinement, web interfaces, and advanced visualization tools.
A computer algebra system (CAS) designed specifically for the solution of problems encountered in field theory. It has extensive functionality for tensor computer algebra, tensor polynomial simplification including multi-term symmetries, fermions and anti-commuting variables, Clifford algebras and Fierz transformations, implicit coordinate dependence, multiple index types and many more. The input format is a subset of TeX. Both a command-line and a graphical interface are available.
The Cameleon language is a graphical data-flow language following a two-scale paradigm. It allows an easy up-scale, that is, the integration of any library written in C++ into the data-flow language. Cameleon aims to democratize macro-programming through an intuitive interaction between the human and the computer, in which building an application based on a data process and a GUI is a simple task to learn and to do. The language allows conditional execution and repetition to solve complex macro-problems; its execution model is described by an extension of the Petri net model.
A library providing cartographic tools for Python.
A set of libraries for performing various tasks in radio astronomy.
Python bindings for the casacore radio astronomy libraries.
A simple, portable, high-performance, scalable, and robust communication interface for HPC and Data Centers. Targeted towards high performance computing (HPC) environments as well as large data centers, CCI can provide a common network abstraction layer (NAL) for persistent services as well as general interprocess communication. In HPC, MPI is the de facto standard for communication within a job. Persistent services such as distributed file systems, code coupling (e.g. a simulation sending output to an analysis application sending its output to a visualization process), health monitoring, debugging, and performance monitoring, however, exist outside of scheduler jobs or span multiple jobs. In these cases, these persistent services tend to use either BSD sockets for portability to avoid having to rewrite the applications for each new interconnect or they implement their own NAL which takes developer time and effort. CCI can simplify support for these persistent services by providing a common NAL which minimizes the maintenance and support for these services while providing improved performance (i.e. reduced latency and increased bandwidth) compared to Sockets.
A C and Fortran Interface to access Climate and NWP model Data. Supported data formats are GRIB, netCDF, SERVICE, EXTRA and IEG.
A large tool set for working on climate and NWP model data. NetCDF 3/4, GRIB 1/2 including SZIP and JPEG compression, EXTRA, SERVICE and IEG are supported as IO-formats. Apart from that CDO can be used to analyse any kind of gridded data not related to climate science. CDO has very small memory requirements and can process files larger than the physical memory.
An asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
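The worker/queue pattern described above can be sketched with only the stdlib, using threads and an in-process queue in place of a message broker. This illustrates the pattern, not Celery's API:

```python
import queue
import threading

tasks = queue.Queue()
results = []

def worker():
    # each worker pulls (function, args) pairs until it sees the sentinel
    while True:
        func, args = tasks.get()
        if func is None:              # sentinel: shut this worker down
            break
        results.append(func(*args))   # "execute the task" and record the result
        tasks.task_done()

# start two concurrent workers, as a broker would feed worker processes
threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()

for i in range(5):                    # enqueue tasks asynchronously
    tasks.put((pow, (i, 2)))
for _ in threads:                     # one sentinel per worker
    tasks.put((None, ()))
for t in threads:
    t.join()
```

Celery adds what this sketch lacks: a real broker between machines, retries, scheduling, and result backends.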
Robust messaging for applications.
An advanced key-value store often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.
Cello is a GNU99 C library which brings higher level programming to C.
A package of high-resolution central schemes for nonlinear conservation laws and related problems.
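As a hedged sketch of the family such packages implement, here is the simplest relative of the high-resolution central schemes, a first-order Lax-Friedrichs step for a scalar conservation law u_t + f(u)_x = 0 on a periodic grid (the package itself implements higher-order variants such as Nessyahu-Tadmor):

```python
def lax_friedrichs_step(u, f, dt_dx):
    """One Lax-Friedrichs update on a periodic grid: average the two
    neighbours, then subtract the centered flux difference."""
    n = len(u)
    return [0.5 * (u[(i - 1) % n] + u[(i + 1) % n])
            - 0.5 * dt_dx * (f(u[(i + 1) % n]) - f(u[(i - 1) % n]))
            for i in range(n)]

# linear advection f(u) = u with a square pulse; CFL number 0.5
u = [1.0 if 10 <= i < 20 else 0.0 for i in range(50)]
mass0 = sum(u)
for _ in range(25):
    u = lax_friedrichs_step(u, lambda x: x, dt_dx=0.5)
```

Because the flux differences telescope on a periodic grid, the scheme conserves the total "mass" sum(u) exactly, which is the defining property of conservative discretizations.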
A compiler infrastructure for the source-to-source transformation of software programs.
Dedicated to open-source, high-performance scientific computing in fluid mechanics and particle science.
An emerging parallel programming language whose design and development are being led by Cray Inc. in collaboration with academia, computing centers, and industry. Chapel’s goal is to make parallel programming more productive, from high-end supercomputers to commodity clusters and multicore desktops and laptops. Chapel is being developed in an open-source manner at SourceForge and is released under the BSD license.
Chapel supports a multithreaded execution model via high-level abstractions for data parallelism, task parallelism, concurrency, and nested parallelism. Chapel’s locale type enables users to specify and reason about the placement of data and tasks on a target architecture in order to tune for locality. Chapel supports global-view data aggregates with user-defined implementations, permitting operations on distributed data structures to be expressed in a natural manner. In contrast to many previous higher-level parallel languages, Chapel is designed around a multiresolution philosophy, permitting users to initially write very abstract code and then incrementally add more detail until they are as close to the machine as their needs require. Chapel supports code reuse and rapid prototyping via object-oriented design, type inference, and features for generic programming.
Chapel was designed from first principles rather than by extending an existing language. It is an imperative block-structured language, designed to be easy to learn for users of C, C++, Fortran, Java, Python, Matlab, and other popular languages. While Chapel builds on concepts and syntax from many previous languages, its parallel features are most directly influenced by ZPL, High-Performance Fortran (HPF), and the Cray MTA™/Cray XMT™ extensions to C and Fortran.
A high-performance language interoperability tool that generates Babel-compatible bindings for the Chapel programming language. For details on using the command-line tool, please consult the BRAID man page and the Babel user’s guide.
Build to Order BLAS
The Build to Order BLAS system is a compiler that generates high-performance implementations of basic linear algebra kernels.
The term BLAS in the name is for Basic Linear Algebra Subprograms. The BLAS is a standard API for important linear algebra operations. The BLAS are implemented by most hardware vendors. Traditionally, each routine in the BLAS is implemented by hand by a highly skilled programmer. The Build to Order BLAS compiler automates the implementation of not only the BLAS standard but also any sequence of basic linear algebra operations.
The user of the Build to Order BLAS compiler writes down a specification for a sequence of matrix and vector operations together with a description of the input and output parameters. The compiler then tries out many different ways to implement, optimize, and tune those operations for the user’s computer hardware. The compiler chooses the best option, which is output as a C file containing a function that implements the specified operations.
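The key optimization this enables is loop fusion across operations. A pure-Python sketch (not BTO's specification language) of the fused form of a specification like y = A*x + z:

```python
def fused_gemv_add(A, x, z):
    """Compute y = A*x + z in a single pass, without materializing the
    intermediate vector A*x -- the kind of fusion BTO automates."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + z_i
            for row, z_i in zip(A, z)]

A = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
z = [10.0, 20.0]
y = fused_gemv_add(A, x, z)
```

Calling separate BLAS routines (gemv, then axpy) would traverse the output vector twice and allocate a temporary; fusing the sequence keeps the data in registers and cache.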
Demonstrates the theory of convolution underlying engineering systems and signal analysis. Designed to enhance the learning experience, C-Graph features an attractive array of scalable pulses and periodic and aperiodic signal types of variable frequency, fundamental to the study of systems theory. The package displays the spectra of any two waveforms chosen by the user, computes their linear convolution, then compares their circular convolution according to the convolution theorem. Each signal is modelled by a register of N discrete values (samples), and the discrete Fourier transform (DFT) is computed by the fast Fourier transform (FFT). Students of signal and systems theory will find GNU C-Graph to be of value in visualizing convolution.
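The convolution theorem that C-Graph visualizes can be checked numerically: the circular convolution of two length-N signals equals the inverse DFT of the pointwise product of their DFTs. A small pure-Python demonstration (a naive O(N^2) DFT stands in for the FFT):

```python
import cmath

def dft(x, sign=-1):
    """Naive DFT; sign=-1 gives the forward transform, sign=+1 the
    un-normalized inverse (divide by N afterwards)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def circular_convolve(a, b):
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]

a = [1.0, 2.0, 3.0, 0.0]
b = [4.0, 0.0, 1.0, 0.0]
direct = circular_convolve(a, b)

# convolution theorem: IDFT(DFT(a) * DFT(b)), with the 1/N factor
prod = [fa * fb for fa, fb in zip(dft(a), dft(b))]
via_dft = [v.real / len(a) for v in dft(prod, sign=+1)]
```

Zero-padding both signals to length at least len(a)+len(b)-1 turns the same identity into a fast method for linear convolution, which is the comparison C-Graph draws.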
A versatile genetic programming application which includes a command-line client and an interactive console mode. It features built in input-output mapping support, and is user-extensible for complex fitness evaluation in Python and Lisp.
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora’s capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
Adds simple language extensions to the C and C++ languages to express task and data parallelism. These language extensions are powerful, yet easy to apply and use in a wide range of applications.
This was an MIT research program that got folded into the commercially available Intel C++ Compiler Suite. There is a branch of the GCC compiler development stack that’s also in the process of including Cilk.
The main objectives of the METAFOR project were to develop and promulgate a de facto standard for describing climate models and associated data. This standard has been formalized and named the Common Information Model (CIM). Adoption of the CIM standard will allow the climate science community to nurture an ecosystem of CIM-compliant tools and services integrated into the day-to-day activities of climate research institutes worldwide. The CIM is an ontology, i.e. an informational model describing a particular domain (climate science). Such a model is formed using a construct known as a class (e.g. simulation). Classes form relationships with other classes (e.g. a simulation has data). Related classes are grouped into packages. The CIM is formally defined using the Unified Modelling Language.
The Coupled-Layer Architecture for Robotic Autonomy is a reusable robotic software framework. CLARAty is a framework that promotes reusable robotic software. It was designed to support heterogeneous robotic platforms and integrate advanced robotic capabilities from multiple institutions. Consequently, its design had to be portable, modular, flexible and extendable.
A lightweight Clifford algebra template library.
A Python-based software component toolkit providing a flexible problem-solving environment for climate science problems. CliMT consists of two layers: a library of climate modeling components (radiative and convective schemes, dynamical cores etc.), mostly in Fortran; and a Python superstructure providing standardized access to each component and allowing coupling of components to form time-dependent models.
Robustly detects extremes against a time-dependent background in climate and weather time series.
A freely available software tool for 3D visualizations and scientific calculations that was conceived and written by Dr. Christian Perwass. CLUCalc interprets a script language called ‘CLUScript’, which has been designed to make mathematical calculations and visualisations very intuitive.
An implementation of the constrained natural element method in 2D and 3D. It is written in C++ and has Python and Matlab wrappers.
A SPMD parallel programming model based on a small set of language extensions to Fortran 90. CAF supports access to non-local data using a natural extension to Fortran 90 syntax, lightweight and flexible synchronization primitives, pointers, and dynamic allocation of shared data. An executing CAF program consists of a static collection of asynchronous process images. Rice’s implementation of Coarray Fortran 2.0 is a work in progress. We are working to create an open-source, portable, retargetable, high-quality CAF 2.0 compiler suitable for use with production codes. To achieve portability, our compiler performs a source-to-source translation from CAF to Fortran 90 with calls to our CAF 2.0 runtime library primitives. Our CAF compiler’s generated code can be compiled by any Fortran 90 compiler that supports Cray pointers. To achieve high performance, we generate Fortran 90 that is readily optimizable by vendor compilers. Our CAF 2.0 runtime library uses UC Berkeley’s GASNet library as a substrate for communication. GASNet’s get and put operations are used to read and write remote coarray elements. GASNet’s active message support is used to invoke operations on remote nodes. This capability is used to form teams and to look up information about remote coarrays so that process images can read and write them directly.
The Common Data Access toolbox (CODA) provides a set of interfaces for reading remote sensing data from earth observation data files. These interfaces consist of command line applications, libraries, interfaces to scientific applications (such as IDL and MATLAB), and interfaces to programming languages (such as C, Fortran, Python, and Java).
CODA provides a single interface to access data in a wide variety of data formats, including ASCII, binary, XML, netCDF, HDF4, HDF5, GRIB, RINEX, and SP3. This is done by using a generic high level type hierarchy mapping for each data format. For self describing formats such as netCDF, HDF, and GRIB, CODA will automatically construct this mapping based on the file itself. For raw ASCII and binary (and partially also XML) formats CODA makes use of an external format definition stored in .codadef files to determine this mapping. On the download section of this website you will find .codadef files for various earth observation missions that can be used with CODA.
The COllaborative DEvelopment SHell project provides an automatic persistent logbook for sessions of personal command-line work by recording what is being done and how: for private use and reuse, and for sharing selected parts with collaborators.
The primary interface for managing Anaconda installations. It can query and search the Anaconda package index and current Anaconda installation, create new Anaconda environments, and install and update packages into existing Anaconda environments.
A scientific tool for the numerical integration of dynamical systems whose mutual couplings are described by a network. Its name is an abbreviation of “Complex Networks Dynamics”.
Conedy supports different dynamical systems with various integration schemes, including ordinary differential equations, iterated maps, stochastic differential equations, and pulse coupled oscillators which are handled via events. In addition, it provides a simple way to handle arbitrary node dynamics. Each dynamical system is associated with a node in a network and edges between such nodes represent couplings. Conedy provides functions to build a network from various node and edge types.
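The node-dynamics-plus-coupling idea Conedy automates can be sketched in a few lines of Python (this illustrates the concept, not Conedy's interface): each node iterates a logistic map, and edges diffusively couple a node to its neighbours.

```python
def step(state, edges, eps=0.3, r=3.8):
    """One synchronous update of coupled logistic maps on a network.
    edges maps each node index to the list of its neighbours."""
    f = lambda x: r * x * (1.0 - x)               # local node dynamics
    new_state = []
    for i, x in enumerate(state):
        neigh = [state[j] for j in edges[i]]
        drive = sum(f(v) for v in neigh) / len(neigh)
        # convex combination of own dynamics and the neighbours' mean
        new_state.append((1.0 - eps) * f(x) + eps * drive)
    return new_state

edges = {0: [1], 1: [0]}                          # two bidirectionally coupled nodes
state = [0.2, 0.7]
for _ in range(100):
    state = step(state, edges)
```

Because the update is a convex combination of logistic-map images, the trajectories stay in the unit interval; on larger networks, varying eps moves the system between incoherent and synchronized regimes, the kind of study Conedy is built for.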
Connectivity Modeling System
A community multiscale modeling system based on a stochastic Lagrangian framework. It was developed to study complex larval migrations and give probability estimates of population connectivity. In addition, the CMS can also provide Lagrangian descriptions of oceanic phenomena (advection, dispersion, retention) and can be used in a broad range of applications, from the dispersion and fate of pollutants to marine spatial conservation.
ConTeXt can be used to typeset complex and large collections of documents, like educational materials, user guides and technical manuals. Such documents often have high demands regarding structure, design and accessibility. Ease of maintenance, reuse of content and typographic consistency are important prerequisites. ConTeXt is developed for those who are responsible for producing such documents. ConTeXt is written in the typographical programming language TeX. For using ConTeXt, no TeX programming skills and no technical background are needed. Some basic knowledge of typography and document design will enable you to use the full power of ConTeXt.
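A minimal ConTeXt source file gives a feel for the no-preamble style; it is compiled with the `context` command:

```tex
% hello.tex -- compile with: context hello.tex
\starttext
\section{Hello}
ConTeXt typesets both structure and design from a single source file.
\stoptext
```

Unlike a LaTeX document, there is no document class or preamble boilerplate; structure, styling, and setup commands all live in the same TeX source.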
A collection of open-source optimization-related Python packages that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.
A data parallel subset of Python which can be dynamically compiled and executed on parallel platforms. Currently, we target NVIDIA GPUs, as well as multicore CPUs through OpenMP and Threading Building Blocks (TBB).
A generic web service and offline processing tool developed within the Centre for Environmental Data Archival (CEDA). The CEDA OGC Web Services (COWS) framework is a set of Python libraries that allow rapid development and deployment of geospatial web applications and services built around the standards managed by the Open Geospatial Consortium (OGC). COWS emphasises rapid service development by providing a lightweight layer of OGC web service logic on top of Pylons, a mature web application framework for the Python language. This approach gives developers a flexible web service development environment without compromising access to the full range of web application tools and patterns: the Model-View-Controller paradigm, XML templating, Object-Relational-Mapper integration, and authentication/authorisation. COWS contains pre-configured implementations of WMS, WCS and WFS services, a web client, and a WPS.
CPython Compiler Tools
Various compiler tools for Python.
A library for multidimensional numerical integration. The Cuba library offers a choice of four independent routines for multidimensional numerical integration: Vegas, Suave, Divonne, and Cuhre. All four have a C/C++, Fortran, and Mathematica interface and can integrate vector integrands. Their invocation is very similar, so it is easy to substitute one method for another for cross-checking. For further safeguarding, the output is supplemented by a chi-square probability which quantifies the reliability of the error estimate.
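The division of labour among Cuba's routines can be illustrated with the simplest member of the family: plain Monte Carlo integration over the unit hypercube, with a standard-error estimate in place of the chi-square diagnostics. This is a hedged, pure-Python sketch of the general idea, not Cuba's actual algorithms or API:

```python
import random
import math

def mc_integrate(f, ndim, nsamples, seed=0):
    """Plain Monte Carlo estimate of the integral of f over the unit
    hypercube [0,1]^ndim, together with a standard-error estimate."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(nsamples):
        x = [rng.random() for _ in range(ndim)]
        fx = f(x)
        total += fx
        total_sq += fx * fx
    mean = total / nsamples
    var = total_sq / nsamples - mean * mean
    stderr = math.sqrt(max(var, 0.0) / nsamples)
    return mean, stderr

# The integral of x*y over [0,1]^2 is exactly 0.25.
est, err = mc_integrate(lambda x: x[0] * x[1], ndim=2, nsamples=20000)
```

Vegas, Suave, Divonne and Cuhre all refine this baseline with importance sampling, subregion sampling or deterministic cubature rules.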
Light-weight Python framework and OLAP HTTP server for easy development of reporting applications and aggregate browsing of multi-dimensionally modeled data.
The Community Surface Dynamics Modeling System (CSDMS) deals with the Earth’s surface - the ever-changing, dynamic interface between lithosphere, hydrosphere, cryosphere, and atmosphere. We are a diverse community of experts promoting the modeling of earth surface processes by developing, supporting, and disseminating integrated software modules that predict the movement of fluids, and the flux (production, erosion, transport, and deposition) of sediment and solutes in landscapes and their sedimentary basins.
A suite engine and meta-scheduler that specializes in suites of cycling tasks for weather and climate forecasting and related processing (it can also be used for one-off workflows of non-cycling tasks, which is a simpler problem).
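The notion of a suite of cycling tasks can be sketched in a few lines: the same dependency graph is walked once per cycle point. This is an illustrative toy (the task names and graph are invented, and a real suite engine runs tasks from different cycles concurrently), using only the standard library's `graphlib` (Python 3.9+):

```python
from graphlib import TopologicalSorter

# Hypothetical mini-suite: task names mapped to their upstream
# dependencies within one cycle point.
GRAPH = {
    "fetch": set(),
    "forecast": {"fetch"},
    "postproc": {"forecast"},
    "archive": {"postproc"},
}

def run_suite(graph, cycles):
    """Run every task of every cycle point in dependency order,
    returning the executed (cycle, task) sequence."""
    log = []
    for cycle in cycles:
        for task in TopologicalSorter(graph).static_order():
            log.append((cycle, task))  # a real engine would submit jobs here
    return log

log = run_suite(GRAPH, ["2024-01-01T00", "2024-01-01T12"])
```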
A middleware targeting multicore HPC platforms. It proposes to dedicate one core to I/O, data-processing prior to effective storage in a parallel file system or in-situ visualization. It provides an extremely simple API and can be easily integrated in existing large-scale simulations. Damaris can be seamlessly connected to the VisIt visualization software in order to provide in-situ visualization capabilities with low impact on the running simulation.
DART is a community facility for ensemble DA developed and maintained by the Data Assimilation Research Section (DAReS) at the National Center for Atmospheric Research (NCAR). DART provides modelers, observational scientists, and geophysicists with powerful, flexible DA tools that are easy to implement and use and can be customized to support efficient operational DA applications. DART is a software environment that makes it easy to explore a variety of data assimilation methods and observations with different numerical models, and it is designed to facilitate the combination of assimilation algorithms, models, and real (as well as synthetic) observations to allow increased understanding of all three.
A software package for numerical simulation of river hydraulics (2D / 1D). It is designed especially for parameter identification, calibration and variational data assimilation, and is interfaced with a few pre- and post-processors.
A lightweight data management application developed in Python that primarily targets the management of huge data accumulations, often encountered in the scientific field. The system is able to handle large amounts of data and can be easily integrated in existing working environments. It can be optimised to fit any situation by embedding scripts.
A robust real-time streaming data engine that lets you quickly stream live data from experiments, labs, web cams and even Java-enabled cell phones. It acts as a "black box" through which applications and devices send and receive data. Think of it as express delivery for your data, be it numbers, video, sound or text. DataTurbine is a buffered middleware, not simply a publish/subscribe system. It can receive data from various sources (experiments, web cams, etc.) and send data to various sinks (visualization interfaces, analysis tools, databases, etc.). It has "TiVo"-like functionality that lets applications pause and rewind live streaming data.
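The "pause and rewind live data" behaviour comes from buffering: a sink can re-read any sample still inside the server's retention window. A minimal pure-Python sketch of such a ring buffer (an illustration of the idea, not DataTurbine's API, which is Java-based):

```python
from collections import deque

class RingBuffer:
    """Bounded buffer of (timestamp, value) samples; sinks can re-read
    ("rewind") any data still inside the retention window."""
    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)  # oldest samples fall off

    def put(self, t, value):
        self._buf.append((t, value))

    def read_since(self, t):
        return [(ts, v) for ts, v in self._buf if ts >= t]

rb = RingBuffer(capacity=3)
for t in range(5):          # push 5 samples into a 3-slot buffer
    rb.put(t, t * 10)
recent = rb.read_since(3)   # rewind to t=3; t=0,1 have already fallen off
```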
A crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data.
A novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent, and works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP.
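The kind of algorithm such a framework makes explicit can be written out directly; here is a minimal (mu+lambda) evolution strategy in plain Python, with an invented one-gene fitness function (DEAP's own API registers operators in a toolbox instead):

```python
import random

def evolve(fitness, mu=5, lam=20, gens=40, sigma=0.5, seed=1):
    """Minimal (mu+lambda) evolution strategy on a single real gene:
    mutate parents, evaluate offspring, keep the best mu each generation."""
    rng = random.Random(seed)
    pop = [rng.uniform(-10, 10) for _ in range(mu)]
    for _ in range(gens):
        offspring = [rng.choice(pop) + rng.gauss(0, sigma) for _ in range(lam)]
        # Elitist survivor selection over parents plus offspring.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:mu]
    return pop[0]

# Maximizing -(x-3)^2 should drive the best individual toward x = 3.
best = evolve(lambda x: -(x - 3.0) ** 2)
```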
A pseudospectral solver for fluid equations. Its primary applications are in astrophysics and cosmology. Written primarily in Python, and making use of the FFTW libraries, Dedalus aims to be a simple, fast, and elegant hydrodynamic and magnetohydrodynamic code.
An open source software for spatial data infrastructures and the geospatial web. deegree includes components for geospatial data management, including data access, visualization, discovery and security. Open standards are at the heart of deegree. The software is built on the standards of the Open Geospatial Consortium (OGC) and the ISO Technical Committee 211. It includes the OGC Web Map Service (WMS) reference implementation, a fully compliant Web Feature Service (WFS) as well as packages for Catalogue Service (CSW), Web Coverage Service (WCS), Web Processing Service (WPS) and Web Map Tile Service (WMTS).
A modeling suite to investigate hydrodynamics, sediment transport, morphology and water quality for fluvial, estuarine and coastal environments. The FLOW module is the heart of Delft3D: a multi-dimensional (2D or 3D) hydrodynamic (and transport) simulation programme which calculates non-steady flow and transport phenomena resulting from tidal and meteorological forcing on a curvilinear, boundary-fitted grid or spherical coordinates. In 3D simulations the vertical grid is defined following the so-called sigma-coordinate approach or the Z-layer approach. The MOR module computes sediment transport (both suspended and bed total load) and morphological changes for an arbitrary number of cohesive and non-cohesive fractions. Both currents and waves act as driving forces, and a wide variety of transport formulae have been incorporated. For the suspended load this module connects to the 2D or 3D advection-diffusion solver of the FLOW module; density effects may be taken into account. An essential feature of the MOR module is the dynamic feedback with the FLOW and WAVE modules, which allows the flows and waves to adjust themselves to the local bathymetry and allows simulations on any time scale from days (storm impact) to centuries (system dynamics). It can keep track of the bed composition to build up a stratigraphic record. The MOR module may be extended with extensive features to simulate dredging and dumping scenarios.
A lightweight job execution control framework for parallel scientific applications. DIANE improves the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery. DIANE provides an environment in which existing applications may be more easily ported to heterogeneous computing environments such as the Grid, batch farms or interactive clusters. The default scheduling plugin algorithms are suited to bag-of-tasks applications and data-parallel problems with no inter-task communication. However, the framework is designed to make it easy to plug in other scheduling algorithms for more complex task synchronization patterns and workflows; for example, the DAG4DIANE plugin provides support for directed acyclic graph (DAG) applications and the MOTEUR plugin provides support for workflow applications.
An EOF-based method to fill in missing data from geophysical fields, such as clouds in sea surface temperature.
A C++ library for computing persistent homology.
A lightweight, open-source framework for distributed computing based on the MapReduce paradigm.
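The MapReduce paradigm itself fits in a dozen lines once the distribution machinery is stripped away. A single-process sketch (Disco distributes the same three phases, map, shuffle and reduce, across a cluster):

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    """Single-process MapReduce skeleton: map each input to (key, value)
    pairs, group by key (the 'shuffle'), then reduce each group."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# The canonical word-count example.
counts = map_reduce(
    ["to be or not to be"],
    mapper=lambda line: [(w, 1) for w in line.split()],
    reducer=lambda word, ones: sum(ones),
)
```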
A version of NumPy that parallelizes array operations in a manner completely transparent to the user - from the perspective of the user, the difference between NumPy and DistNumPy is minimal. DistNumPy can use multiple processors through the communication library Message Passing Interface (MPI). In DistNumPy MPI communication is fully transparent and the user needs no knowledge of MPI or any parallel programming model. However, the user is required to use the array operations in DistNumPy to obtain any kind of speedup.
DIVA (Data-Interpolating Variational Analysis) allows the spatial interpolation of data (analysis) in an optimal way, comparable to optimal interpolation (OI). In comparison to OI it takes into account coastlines, sub-basins and advection. Calculations are highly optimized and rely on a finite element resolution. Tools to generate the finite element mesh are provided as well as tools to optimize the parameters of the analysis. Quality control of data can be performed and error fields can be calculated.
An open-source package of scalable building blocks for data movement tailored to the needs of large-scale parallel analysis workloads. Scalable, parallel analysis of data-intensive computational science relies on the decomposition of the analysis problem among a large number of distributed-memory compute nodes, the efficient data exchange among them, and data transport between compute nodes and a parallel storage system. Configurable data partitioning, scalable data exchange, and efficient parallel I/O are the main components of DIY, a library that assists developers in parallelizing serial analysis algorithms by providing configurable, high-performance data movement algorithms built on top of MPI. Computational scientists, data analysis researchers, and visualization tool builders can all benefit from these tools.
DataMover-Lite (DML) is a simple file transfer tool with a graphical user interface which supports multi-protocol data movement. It supports http, https, ftp, gridftp, lahfs and scp. For GridFTP, DML also supports directory browsing and transferring.
DOLFIN is the C++/Python interface of FEniCS, providing a consistent PSE (Problem Solving Environment) for ordinary and partial differential equations.
A package for building open digital repositories. It is free and easy to install "out of the box" and completely customizable to fit the needs of any organization. DSpace preserves and enables easy and open access to all types of digital content including text, images, moving images, mpegs and data sets. And with an ever-growing community of developers, committed to continuously expanding and improving the software, each DSpace installation benefits from the next.
The Distributed and Unified Numerics Environment is a modular toolbox for solving partial differential equations (PDEs) with grid-based methods. It supports the easy implementation of methods like Finite Elements (FE), Finite Volumes (FV), and also Finite Differences (FD).
A project to develop a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model.
An open-architecture, open-source public software package for data acquisition, processing, archival and distribution. Earthworm was originally developed by the United States Geological Survey; its binaries and source files are freely available to everyone.
Python wrapper for accessing an Earthworm shared memory ring.
A visual analytics tool for exploring multivariate data sets. EDEN helps you see the associations among variables for guided analysis. EDEN harnesses the parallel coordinates visualization technique and is augmented with graphical indicators of key descriptive statistics.
A C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
Ellipsoidal Potential Theory
Open-source (BSD) implementations of ellipsoidal harmonic expansions for solving problems of potential theory using separation of variables.
An open source multiphysical simulation software package mainly developed by CSC - IT Center for Science (CSC). Elmer development started in 1995 in collaboration with Finnish universities, research institutes and industry. After its open-source publication in 2005, the use and development of Elmer has become international.
Elmer includes physical models of fluid dynamics, structural mechanics, electromagnetics, heat transfer and acoustics, for example. These are described by partial differential equations which Elmer solves by the Finite Element Method (FEM).
A generic multivariate extension of standard Empirical Mode Decomposition (EMD): an algorithm that finds common rotational modes among all the channels of n-channel data.
This open source digitizing software converts an image file showing a graph or map into numbers. The image file can come from a scanner, digital camera or screenshot. The numbers can be read on the screen, and written or copied to a spreadsheet.
The process starts with an image file containing a graph or map. The final result is digitized data that can be used by other tools such as Microsoft Excel and Gnumeric.
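The core of graph digitizing is a per-axis linear calibration: the user identifies axis points whose data values are known, and every other pixel is mapped through the resulting transform. A hedged sketch with invented calibration numbers (real digitizers also handle rotated and logarithmic axes):

```python
def make_axis_map(p0, d0, p1, d1):
    """Linear pixel->data mapping from two calibration points:
    pixel coordinate p0 corresponds to data value d0, and p1 to d1."""
    scale = (d1 - d0) / (p1 - p0)
    return lambda p: d0 + (p - p0) * scale

# Hypothetical calibration: x-axis pixels 50..450 span data 0..10;
# y-axis pixels 400..40 span data 0..100 (pixel y grows downwards).
to_x = make_axis_map(50, 0.0, 450, 10.0)
to_y = make_axis_map(400, 0.0, 40, 100.0)

# Digitize one clicked pixel at (250, 220).
point = (to_x(250), to_y(220))
```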
The EnKF is a sophisticated sequential data assimilation method. It applies an ensemble of model states to represent the error statistics of the model estimate, uses ensemble integrations to predict the error statistics forward in time, and uses an analysis scheme which operates directly on the ensemble of model states when observations are assimilated. The EnKF has proven to handle strongly nonlinear dynamics and large state spaces efficiently and is now used in realistic applications with primitive equation models for the ocean and atmosphere.
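For a single scalar state, the analysis step reduces to a few lines: the ensemble spread supplies the forecast error variance, and each member is nudged toward a perturbed observation by the resulting Kalman gain. A simplified pure-Python sketch (real EnKF implementations handle multivariate states and observation operators):

```python
import random

def enkf_analysis(ensemble, obs, obs_err, seed=0):
    """One scalar EnKF analysis step with perturbed observations."""
    rng = random.Random(seed)
    n = len(ensemble)
    mean = sum(ensemble) / n
    var = sum((x - mean) ** 2 for x in ensemble) / (n - 1)  # forecast error variance
    gain = var / (var + obs_err ** 2)                       # Kalman gain
    # Each member assimilates its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0, obs_err) - x) for x in ensemble]

forecast = [1.0, 2.0, 3.0, 4.0, 5.0]   # prior ensemble, mean 3.0
analysis = enkf_analysis(forecast, obs=6.0, obs_err=0.5)
```

The analysis mean moves toward the observation and the ensemble spread shrinks, as expected from the Kalman update.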
Enthought Tool Suite
A suite of Python tools for constructing custom scientific applications.
A Python plotting application toolkit that facilitates writing plotting applications at all levels of complexity, from simple scripts with hard-coded data to large plotting programs with complex data interrelationships and a multitude of interactive tools. While Chaco generates attractive static plots for publication and presentation, it also works well for interactive data visualization and exploration.
Enaml is Not A Markup Language. Enaml is a library for creating professional quality user interfaces with minimal effort. Enaml combines a domain specific declarative language with a constraints based layout system to allow users to easily define rich UIs with complex and flexible layouts. Enaml applications can transparently run on multiple backends (Qt and Wx) and on multiple operating systems.
A Python package for 3D scientific visualization. The project includes Mayavi, a tool for easy, interactive visualization of data that’s integrated with Python scientific libraries, and TVTK, a Traits-based wrapper for VTK.
A trait is a type definition that can be used for normal Python object attributes, giving the attributes additional characteristics such as initialization, validation, delegation, notification and visualization. The Traits package was developed to address some of the problems caused by not having declared variable types, in those cases where problems might arise.
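The mechanism behind such typed attributes can be imitated with a plain Python descriptor; this toy `Typed` class (an invented name, not the Traits API) shows default values, validation and change notification:

```python
class Typed:
    """Descriptor giving a plain attribute a declared type, a default
    value, and change notification -- a toy analogue of a trait."""
    def __init__(self, typ, default):
        self.typ, self.default = typ, default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        if not isinstance(value, self.typ):     # validation
            raise TypeError(f"{self.name} must be {self.typ.__name__}")
        old = self.__get__(obj)
        obj.__dict__[self.name] = value
        if hasattr(obj, "notify"):              # notification
            obj.notify(self.name, old, value)

class Particle:
    mass = Typed(float, 1.0)                    # declared, defaulted attribute
    def __init__(self):
        self.changes = []
    def notify(self, name, old, new):
        self.changes.append((name, old, new))

p = Particle()
p.mass = 2.5    # validated, and the change is recorded
```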
A collection of precompiled C libraries for timing, coordinate conversions, orbit propagation, satellite pointing calculations, and target visibility calculations. This software is made available by the Envisat project to any user involved in the Envisat mission preparation/exploitation.
The Earth Observation CFI software is a collection of precompiled C libraries for timing, coordinate conversions, orbit propagation, satellite pointing calculations, and target visibility calculations. This software is made available by the EOP system support division to any user involved in the Earth Observation missions preparation/exploitation. As of version 4.0, the Earth Observation CFI Software is available both as C and C++ precompiled libraries and Java libraries.
A server for earth observation data.
A programming tool for implementing mathematical models in Python using the finite element method (FEM). Because users do not access the underlying data structures directly, it is very easy to use, and scripts can run unchanged on desktop computers as well as highly parallel supercomputers. Application areas for escript include earth mantle convection, geophysical inversion, earthquakes, porous media flow, reactive transport, plate subduction, erosion, and tsunamis.
The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) enterprise system is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data. ESGF’s primary goal is to facilitate advancements in Earth System Science. ESGF P2P is a component architecture expressly designed to handle large-scale data management for worldwide distribution. The team of computer scientists and climate scientists has developed an operational system for serving climate data from multiple locations and sources. Model simulations, satellite observations, and reanalysis products are all being served from the ESGF P2P distributed data archive.
Empirical gramians can be computed for linear and nonlinear control systems for purposes of model order reduction (MOR), uncertainty quantification (UQ) or system identification (SYSID). Model reduction using empirical gramians can be applied to the state space, to the parameter space or to both through combined reduction. For state reduction the empirical controllability gramian and the empirical observability gramian, for balanced truncation, are available, or alternatively the empirical cross gramian for direct truncation. For parameter reduction, parameter identification and sensitivity analysis the empirical sensitivity gramian (controllability of parameters) or the empirical identifiability gramian (observability of parameters) are provided. Combined state and parameter reduction is enabled by the empirical joint gramian, which computes controllability and observability of states and parameter concurrently. The emgr framework is a compact open source toolbox for (empirical) GRAMIAN-based model reduction and compatible with OCTAVE and MATLAB.
An interpolation and encoding library for ECMWF data.
The Earth System Modeling Framework (ESMF) collaboration is building high-performance, flexible software infrastructure to increase ease of use, performance portability, interoperability, and reuse in climate, numerical weather prediction, data assimilation, and other Earth science applications. The ESMF defines an architecture for composing complex, coupled modeling systems and includes data structures and utilities for developing individual models.
ESMF Python interface.
A computer language devoted to elementary plane geometry. It aims to be a fairly comprehensive system to create geometric figures, either static or dynamic. Eukleides handles basic data types (numbers and strings) as well as geometric data types: points, vectors, sets (of points), lines, circles and conics.
A Eukleides script usually consists of a declarative part, where objects are defined, and a descriptive part, where objects are drawn. Nevertheless, Eukleides is also a full-featured programming language, providing conditional and iterative structures, user-defined functions, modules, etc. Hence, it can easily be extended.
The Eukleides distribution mainly provides two interpreters: eukleides and euktopst. The former produces Encapsulated PostScript (EPS) files and can also, using a converter, yield animated GIFs. The latter produces PSTricks macros, which makes it possible to include Eukleides figures in LaTeX documents.
A program for quickly and interactively computing with real and complex numbers and matrices, or with intervals, in the style of Matlab, Octave, etc. It can draw and animate your functions in two and three dimensions.
A software tool for detecting equations and hidden mathematical relationships in your data. Its goal is to identify the simplest mathematical formulas which could describe the underlying mechanisms that produced the data.
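At its core, equation discovery is a search over candidate formulas scored against the data. A drastically simplified sketch that searches an invented four-formula library by least squares (tools of this kind explore an open-ended formula space with genetic programming and also penalize complexity):

```python
import math

# Invented candidate formula library; a real symbolic-regression tool
# searches a vastly larger, dynamically generated space.
CANDIDATES = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "2x + 1": lambda x: 2 * x + 1,
    "sin(x)": math.sin,
}

def best_formula(xs, ys):
    """Return the candidate name with the smallest sum of squared errors."""
    def sse(f):
        return sum((f(x) - y) ** 2 for x, y in zip(xs, ys))
    return min(CANDIDATES, key=lambda name: sse(CANDIDATES[name]))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]      # data generated by y = 2x + 1
found = best_formula(xs, ys)   # recovers "2x + 1"
```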
Empirical wavelet transform toolbox for Matlab.
A Fortran to CUDA (or C) compiler.
An extension module for Python which implements an optimized, register-machine-based interpreter inside your interpreter. You specify which functions you want Falcon to wrap (or your entire module), and Falcon takes over execution from there.
Falkon aims to enable the rapid and efficient execution of many tasks on large compute clusters, and to improve application performance and scalability using novel data management techniques.
Fast Artificial Neural Network Library is a free open source neural network library, which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast. Bindings to more than 15 programming languages are available.
A data processing library that offers a set of searching functions supported by compressed bitmap indexes. The key technology underlying the FastBit software is a set of compressed bitmap indexes. In database systems, an index is a data structure to accelerate data accesses and reduce the query response time. Most of the commonly used indexes are variants of the B-tree, such as B+-tree and B*-tree. FastBit implements a set of alternative indexes called compressed bitmap indexes. Compared with B-tree variants, these indexes provide very efficient searching and retrieval operations, but are somewhat slower to update after a modification of an individual record.
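The effect of a bitmap index is easy to demonstrate: one bitset per distinct value turns a conjunctive query into a single bitwise AND. A pure-Python sketch using integers as uncompressed bitsets (FastBit additionally compresses its bitmaps):

```python
def build_bitmaps(column):
    """One bitmap (a Python int used as a bitset) per distinct value:
    bit i is set when row i holds that value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def rows_of(bitmap):
    """Decode a bitset back into a sorted list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]

color = ["red", "blue", "red", "green", "blue", "red"]
shape = ["sq", "sq", "tri", "tri", "sq", "sq"]
by_color, by_shape = build_bitmaps(color), build_bitmaps(shape)

# The query "color = red AND shape = sq" is one bitwise AND.
hits = rows_of(by_color["red"] & by_shape["sq"])
```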
Python bindings for FastBit.
A C++ library for hierarchical, agglomerative clustering. It provides a fast implementation of the most efficient, current algorithms when the input is a dissimilarity index. Moreover, it features memory-saving routines for hierarchical clustering of vector data. It improves both asymptotic time complexity (in most cases) and practical performance (in all cases) compared to the existing implementations in standard software: several R packages, MATLAB, Mathematica, Python with SciPy.
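Agglomerative clustering itself is simple to state; a naive O(n^3) single-linkage sketch on 1-D points shows the idea that fastcluster implements with far better asymptotics:

```python
def single_linkage(points, k):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    whose closest members are nearest, until k clusters remain."""
    clusters = [[p] for p in points]

    def dist(a, b):  # single linkage: minimum pairwise distance
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters.pop(j)
    return [sorted(c) for c in clusters]

# Three well-separated groups should be recovered at k=3.
clusters = single_linkage([0.0, 0.1, 0.2, 5.0, 5.1, 9.0], k=3)
```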
A Matlab program that implements the fast fixed-point algorithm for independent component analysis and projection pursuit. It features an easy-to-use graphical user interface, and a computationally powerful algorithm.
Fatiando a Terra
A Python toolkit for geophysical modeling and inversion.
FCM uses Subversion for code management but defines a common process and naming convention to simplify usage. It adds a layer on top of Subversion to provide a more natural and user-friendly interface. FCM features a powerful build system, mainly aimed at building modern Fortran software applications.
An API for manipulating, defining and analyzing geospatial information regardless of where it is stored. FDO uses a provider-based model for supporting a variety of geospatial data sources, where each provider typically supports a particular data format or data store.
Developing a new probabilistic model requires developing a representation for the model and a reasoning algorithm that can draw useful conclusions from evidence, which can be challenging tasks. Furthermore, it can be difficult to integrate a probabilistic model into a larger program.
Figaro is a probabilistic programming language that helps address both these issues. Figaro makes it possible to express probabilistic models using the power of programming languages, giving the modeler the expressive tools to create all sorts of models. Figaro comes with a number of built-in reasoning algorithms that can be applied automatically to new models. In addition, Figaro models are data structures in the Scala programming language, which is interoperable with Java, and can be constructed, manipulated, and used directly within any Scala or Java program.
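The reasoning such a language automates can be shown in miniature: exact posterior inference by enumeration over a two-variable discrete model. The model below is an invented burglary/alarm example in plain Python, not Figaro's Scala API:

```python
def enumerate_posterior(prior, likelihood, evidence):
    """Exact inference by enumeration for a two-variable discrete model:
    P(cause | effect = evidence) is proportional to
    P(cause) * P(effect = evidence | cause)."""
    unnorm = {c: p * likelihood[c][evidence] for c, p in prior.items()}
    z = sum(unnorm.values())
    return {c: w / z for c, w in unnorm.items()}

# Hypothetical model: a burglary rarely happens but usually sets off
# the alarm; the alarm rarely sounds otherwise.
prior = {"burglary": 0.01, "no_burglary": 0.99}
likelihood = {
    "burglary": {"alarm": 0.95, "quiet": 0.05},
    "no_burglary": {"alarm": 0.02, "quiet": 0.98},
}
posterior = enumerate_posterior(prior, likelihood, evidence="alarm")
```

Built-in algorithms in a probabilistic programming language generalize this enumeration (and approximate alternatives such as sampling) to arbitrary models.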
The File Interpolation, Manipulation and EXtraction library for gridded geospatial data, written in C/C++. It converts between different, extensible data formats (currently NetCDF, NcML, GRIB1/2 and felt), enables you to change the projection and interpolation of scalar and vector grids, and makes it possible to subset the gridded data and extract only parts of the files.
The objective of the FLAME project is to transform the development of dense linear algebra libraries from an art reserved for experts to a science that can be understood by novice and expert alike. Rather than being only a library, the project encompasses a new notation for expressing algorithms, a methodology for systematic derivation of algorithms, Application Program Interfaces (APIs) for representing the algorithms in code, and tools for mechanical derivation, implementation and analysis of algorithms and implementations.
A software framework for instantiating high-performance BLAS-like dense linear algebra libraries.
A high performance dense linear algebra library that is the result of the FLAME methodology for systematically developing dense linear algebra libraries.
Extends C++ for matrix/vector types ideally suited for numerical linear algebra.
An open source, general purpose, multi-phase computational fluid dynamics code capable of numerically solving the Navier-Stokes equations and accompanying field equations on arbitrary unstructured finite element meshes in one, two and three dimensions. It is used in a number of different scientific areas including geophysical fluid dynamics, computational fluid dynamics, ocean modelling and mantle convection. It uses a finite element/control volume method which allows arbitrary movement of the mesh in time-dependent problems, allowing mesh resolution to increase or decrease locally according to the current simulated state. It has a wide range of element choices including mixed formulations. Fluidity is parallelised using MPI and is capable of scaling to many thousands of processors. Other novel features are a user-friendly GUI and a Python interface which can be used to calculate diagnostic fields, set prescribed fields or set user-defined boundary conditions.
A software framework for supporting the efficient development, construction, execution, and scientific interpretation of atmospheric, oceanic, and climate system models.
An OpenMP runtime compatible with GCC 4.2, offering a structured way to efficiently execute OpenMP applications onto hierarchical (NUMA) architectures.
Marcel is a thread library that was originally developed to meet the needs of the PM2 multithreaded environment. Marcel provides a POSIX-compliant interface and a set of original extensions. It can also be compiled to provide ABI-compatibility with NPTL threads under Linux, so that multithreaded applications can use Marcel without being recompiled. Marcel features a two-level thread scheduler (also called an N:M scheduler) that achieves the performance of a user-level thread package while being able to exploit multiprocessor machines. The architecture of Marcel was carefully designed to support a high number of threads and to efficiently exploit hierarchical architectures (e.g. multi-core chips, NUMA machines).
Elemental is open-source software for distributed-memory dense linear algebra.
The Family of Simplified Solver Interfaces is designed for easy integration and selection of parallel solvers in Fortran codes which make use of the compressed sparse row (CSR) matrix format. FoSSI contains rather similar interfaces to the most popular and widespread parallel solver libraries obtainable on the web: PETSc, HYPRE, AZTEC and MUMPS. Furthermore, an interface to the PILUT library is included, together with the PILUT solver itself.
A phase-resolving, time-stepping Boussinesq model for ocean surface wave propagation in the nearshore. The present version of FUNWAVE is based on the MUSCLE-TVD finite volume scheme together with adaptive Runge Kutta time stepping. The code is parallelized using MPI and has been tested in linux and unix (Mac OS X) environments.
Enables simulation of the Boussinesq or shallow water equations. CaFunwave is based on Funwave.
A prognostic, unstructured-grid, finite-volume, free-surface, 3-D primitive equation coastal ocean circulation model developed by joint UMASSD-WHOI efforts. The model consists of momentum, continuity, temperature, salinity and density equations and is closed physically and mathematically using turbulence closure submodels. The horizontal grid is composed of unstructured triangular cells and the irregular bottom is represented using generalized terrain-following coordinates. The General Ocean Turbulence Model (GOTM) developed by Burchard’s research group in Germany (Burchard, 2002) has been added to FVCOM to provide optional vertical turbulent closure schemes.
The Geometric Algebra Algorithms Expression Templates library is a C++ library for evaluating geometric algebra expressions. It offers comfortable implementation and reasonable speed by using expression templates and metaprogramming techniques.
The basic idea of fast Geometric Algebra implementations is to do the grading operations beforehand, so only basic operations on the coordinates are performed at runtime. Gaalet does so by applying the grading operations with C++ metaprogramming techniques at compile time. These grading operations are incorporated into expression templates, also a metaprogramming technique, which offers C++ compilers a good starting point for code optimization as well as programmers the concept of lazy evaluation.
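Expression templates pair operator overloading with lazy evaluation: the operators build an expression tree, and nothing is computed until the whole expression is evaluated at once. A Python sketch of the lazy-evaluation half of the idea (the compile-time optimization half is specific to C++ templates):

```python
class Expr:
    """Lazy expression node: operator overloads build a tree instead of
    computing anything; eval() walks the tree on demand."""
    def __add__(self, other):
        return BinOp(self, other, lambda a, b: a + b)
    def __mul__(self, other):
        return BinOp(self, other, lambda a, b: a * b)

class Const(Expr):
    def __init__(self, value):
        self.value = value
    def eval(self):
        return self.value

class BinOp(Expr):
    def __init__(self, left, right, op):
        self.left, self.right, self.op = left, right, op
    def eval(self):
        return self.op(self.left.eval(), self.right.eval())

# (2 + 3) * 4 is built as a tree, then evaluated once, on demand.
tree = (Const(2) + Const(3)) * Const(4)
result = tree.eval()
```

In C++ the analogous tree is a template type known at compile time, which is what lets the compiler fold the whole expression into tight coordinate arithmetic.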
Gaalop (Geometric Algebra Algorithms Optimizer) is a software tool to optimize geometric algebra files. Algorithms can be developed by using the freely available CLUCalc software by Christian Perwass. Gaalop optimizes the algorithm and produces C++, OpenCL, CUDA, CLUCalc or LaTeX output (other output formats will follow). The optimized code contains no more geometric algebra operations and can be run very efficiently on various platforms.
A code generator for geometric algebra. Currently supported languages are C, C++, C# and Java.
A software package for the global numerical analysis of dynamical systems and optimization problems based on set oriented techniques. It may e.g. be used to compute invariant sets, invariant manifolds, invariant measures and almost invariant sets in dynamical systems and to compute the globally optimal solutions of both scalar and multiobjective problems.
A system that automatically executes "Galoized" serial C++ or Java code in parallel on shared-memory machines. It works by exploiting amorphous data-parallelism, which is present even in irregular codes that are organized around pointer-based data structures such as graphs and trees. The Galois system includes the Lonestar benchmark suite and the ParaMeter profiler.
Multicore processors are becoming increasingly the norm. As a result, we need to find ways to make it easier to write parallel programs. Galois allows the programmer to write serial C++ or Java code while still getting the performance of parallel execution. All the programmer has to do is use Galois-provided data structures, which are necessary for correct concurrent execution, and annotate which loops should be run in parallel. The Galois system then speculatively extracts as much parallelism as it can. The current release includes a dozen sample benchmarks applications from a broad range of domains that are written using the Galois extensions and classes.
A language-independent, low-level networking layer that provides network-independent, high-performance communication primitives tailored for implementing parallel global address space SPMD languages such as UPC, Titanium, and Co-Array Fortran. The interface is primarily intended as a compilation target and for use by runtime library writers (as opposed to end users), and the primary goals are high performance, interface portability, and expressiveness. GASNet stands for "Global-Address Space Networking".
Python bindings for GASNet.
An extension of the C programming language designed for high performance computing on large-scale parallel machines. The language provides a uniform programming model for both shared and distributed memory hardware. The programmer is presented with a single shared, partitioned address space, where variables may be directly read and written by any processor, but each variable is physically associated with a single processor. UPC uses a Single Program Multiple Data (SPMD) model of computation in which the amount of parallelism is fixed at program startup time, typically with a single thread of execution per processor.
An object-oriented geophysical and astrophysical spectral-element adaptive refinement code. Like most spectral-element codes, GASpAR combines finite-element efficiency with spectral-method accuracy. It is also designed to be flexible enough for a range of geophysics and astrophysics applications where turbulence or other complex multi-scale problems arise. The formalism accommodates both conforming and non-conforming elements.
A multi-purpose program for performing geometric algebra computations and visualizing geometric algebra.
Extensions to the SQLAlchemy framework for working with spatial databases. Supported database systems include PostGIS and Spatialite.
A Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. It provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language.
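As a rough sketch of that "Pythonic domain language" (the table and column names below are made up for illustration), defining a mapped class and querying it looks like:

```python
from sqlalchemy import create_engine, Column, Integer, String, Float
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Station(Base):
    # hypothetical table of observation stations
    __tablename__ = "stations"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    elevation_m = Column(Float)

engine = create_engine("sqlite:///:memory:")   # in-memory database for the demo
Base.metadata.create_all(engine)               # emits CREATE TABLE for us

Session = sessionmaker(bind=engine)
session = Session()
session.add(Station(name="Boulder", elevation_m=1655.0))
session.add(Station(name="Miami", elevation_m=2.0))
session.commit()

# the query is composed in Python; SQLAlchemy generates the SQL
highest = session.query(Station).order_by(Station.elevation_m.desc()).first()
```

The same mapped classes work unchanged against PostgreSQL, MySQL, and other backends by swapping the engine URL.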
GeoJModelBuilder couples geoprocessing Web services, NASA World Wind and Sensor Web services to support geoprocessing modeling and environmental monitoring. The main goal of GeoJModelBuilder is to provide an easy-to-use tool for the geoscientific community.
The tool allows users to drag and drop various geospatial services to visually generate workflows and to interact with those workflows in a virtual globe environment. It also lets users audit the trails of workflow executions and check the provenance of data products, supporting scientific reproducibility.
The tool is developed in Java for its platform independence, so it can be run on any operating system that supports Java, such as Windows or Unix/Linux.
GeoLearn is designed to enable rapid processing of large satellite remote sensing datasets available in HDF-EOS format. It has been tested primarily with MODIS land-surface data products. Use and analysis of these datasets are at the heart of a variety of scientific investigations pertaining to the interaction between land surface and climate, and the prediction of terrestrial hydrologic processes.
Adds spatial capabilities to scripting languages, e.g. Python.
A server written in Java that allows users to share and edit geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards. Has a Python scripting interface.
GeoTemCo consists of several views showing the data's different dimensions: a map view for the geospatial distribution of items, a time view for their temporal distribution, and a detail view for the inspection of individual items.
A program for the solution of the partial differential equations describing fluid flow.
A 3D numerical model simulating the most important hydrodynamic and thermodynamic processes in natural waters. The model is general in the sense that it can be applied to various systems, scales and specifications. It includes, for example, flooding and drying of tidal flats, flexible vertical and horizontal coordinate systems, and different turbulence models integrated from GOTM.
A next-generation network shared file system, intended as an alternative to NFS and designed to meet the demand for a much larger, more reliable, and faster file system.
The Generic Grid Generator generates dipole and tripole grids, and also supports editing the topography.
An efficient and portable "shared-memory" programming interface for distributed-memory computers.
Python Interface Doc: http://www.emsl.pnl.gov/docs/global/python/index.html
You can share and transfer files to/from a local machine—campus server, desktop computer or laptop—even if it’s behind a firewall and you do not have administrator privileges.
A generic library of C++ templates which implement universal Clifford algebras over the field of real numbers. Incorporates the PyClical extension module for Python. This gives users an easier Python scripting interface for calculations in Clifford algebras.
A Python library to explore relationships within and among related datasets.
GmtPy provides seamless integration of GMT plotting into Python programs. On top of that it provides (in an opt-in fashion): autoscaling, automatic tick increment determination, layout management, and more.
General NOAA Operational Modeling Environment, TNG.
A Python wrapper for the Google Chart API. The wrapper can render the URL of a Google chart, based on your parameters, or it can render an HTML img tag to insert into webpages on the fly. Made for dynamic Python websites (Django, Zope, CGI, etc.) that need on-the-fly chart generation without any extra modules.
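For context, a chart URL of the kind the wrapper produces can be built by hand from the chart API's query parameters (a minimal sketch, not the wrapper's own API; `line_chart_url` is a hypothetical helper name):

```python
from urllib.parse import urlencode

def line_chart_url(values, size="300x150"):
    """Build a (legacy) Google Image Charts URL from scratch.
    Parameter names follow the old chart API: cht = chart type,
    chs = image size, chd = data series."""
    params = {
        "cht": "lc",                                    # "lc" = line chart
        "chs": size,
        "chd": "t:" + ",".join(str(v) for v in values), # text-encoded data
    }
    return "https://chart.googleapis.com/chart?" + urlencode(params)

url = line_chart_url([10, 20, 15, 30])
```

A wrapper like the one described saves you from assembling and escaping these parameters manually.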
An API for the development of scalable, asynchronous and fault tolerant parallel applications.
An open-source dynamic JIT compilation framework for GPU compute applications targeting a range of GPU and non-GPU execution targets. Ocelot supports CUDA applications and provides an implementation of the CUDA Runtime API, enabling seamless integration. NVIDIA's PTX virtual instruction set architecture is used as a device-agnostic program representation that captures the data-parallel SIMT execution model of CUDA applications. Ocelot supports several backend execution targets: a PTX emulator, NVIDIA GPUs, AMD GPUs, and a translator to LLVM for efficient execution of GPU kernels on multicore CPUs.
A collection of hundreds of Matlab scripts, many of which are useful for the geosciences.
A free and open source Geographic Information System (GIS) software suite used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization.
A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks.
A scalable data transfer management tool for the GridFTP transfer protocol. The goal is to reliably manage as much as 1+ PB of data and millions of file transfers.
A C code that provides a command line utility for non-interactive generation of multi-corner quasi-orthogonal grids inside simply connected polygonal regions. It is based on the CRDT algorithm, which makes it possible to handle regions with elongated channels in a numerically robust way.
Provides C library functions and command line utilities for working with curvilinear grids. gridutils has been developed and used mainly for grids generated by gridgen, but can be used to handle arbitrary 2D quadrilateral simply connected multi-corner grids.
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.
A Python interface to GSL.
A Python library for generating GUIs for easy dataset editing and display.
The GOCE User Toolbox (GUT) is a compilation of tools for the utilisation and analysis of GOCE Level 2 products. It facilitates the use, viewing and post-processing of GOCE Level 2 mission data products for optimal use in geodesy, oceanography and solid Earth physics. GUT is a command-line processor designed for users at all levels of expertise.
The Geophysical Wavelet Library (GWL) is a software package based on the continuous wavelet transform. It performs the direct and inverse continuous wavelet transforms, 2C and 3C polarization analysis and filtering, modeling of dispersed and attenuated wave propagation in the time-frequency domain, and optimization in the signal and wavelet domains with the aim of extracting velocities and attenuation parameters from a seismogram. The novelty of this package is that the continuous wavelet transform is incorporated into the library, with time-frequency polarization and dispersion analysis as its kernel. The library has a wide range of potential applications in signal analysis and may be particularly suitable for geophysical problems, as illustrated by analyses of synthetic, geomagnetic and real seismic data.
A modified version of the Hadoop MapReduce framework designed to serve iterative applications. HaLoop not only extends MapReduce with programming support for iterative applications, but also dramatically improves their efficiency by making the task scheduler loop-aware and by adding various caching mechanisms. Evaluated on real queries and real datasets, HaLoop on average reduces query runtimes by a factor of 1.85 compared with Hadoop, and shuffles only 4% as much data between mappers and reducers.
The ExaScale IO (ESIO) library provides simple, high throughput input and output of structured data sets using parallel HDF5. ESIO is designed to support reading and writing turbulence simulation restart files but it may be useful in other contexts. The library is written in C99 and may be used by C89 or C++ applications. A Fortran API built atop the F2003 standard ISO_C_BINDING is also available.
An HDF5 file editor.
A Virtual File Driver for HDF5 which uses parallel communication to transfer data between applications using the HDF5 IO API and a distributed shared memory (DSM) buffer.
A Python interface to HDF5.
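With h5py, HDF5 files behave like dictionaries of NumPy arrays. A minimal sketch (the file name and dataset layout here are made up for illustration):

```python
import numpy as np
import h5py

# write a small gridded dataset with an attribute
with h5py.File("demo.h5", "w") as f:
    dset = f.create_dataset("model_output/temperature",      # groups are created automatically
                            data=np.arange(32.0).reshape(4, 8),
                            compression="gzip")              # HDF5's built-in compression
    dset.attrs["units"] = "K"

# read it back; slicing a dataset yields a NumPy array
with h5py.File("demo.h5", "r") as f:
    temp = f["model_output/temperature"][:]
    units = f["model_output/temperature"].attrs["units"]
```

Because datasets are sliced lazily, only the requested portion of a large array is ever read from disk.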
A set of utilities for visualization and conversion of scientific data in the free, portable HDF5 format. Besides providing a simple tool for batch visualization as PNG images, h5utils also includes programs to convert HDF5 datasets into the formats required by other free visualization software (e.g. plain text, Vis5d, and VTK).
A visual tool for browsing and editing HDF4 and HDF5 files.
A high level interface to the Hierarchical Data Format, version 5, developed and maintained by the HDF Group at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. HDF5 is a file format designed for maximum flexibility and efficiency, and it makes use of modern software technology. HDF5 sports such fundamental characteristics as platform independence and efficient built-in compression, and it can be used to store virtually any kind of scientific data. HL-HDF is designed to focus on selected HDF5 functionality and make it available to users at a high level of abstraction to facilitate data management. This distribution contains HL-HDF source code and associated documentation. The first version also comes prebuilt for a multitude of platforms.
The H5hut library is an implementation of several data models for particle-based simulations that encapsulates the complexity of parallel HDF5 and is simple to use, yet does not compromise performance. H5hut is tuned for writing collectively from all processors to a single, shared file. Although collective I/O performance is typically (but not always) lower than that of file-per-processor, having a shared file simplifies scientific workflows in which simulation data needs to be analyzed or visualized. In this scenario, the file-per-processor approach leads to data management headaches because large collections of files are unwieldy to manage from a file system standpoint. On a parallel file system like Lustre, even the ls utility will break when presented with tens of thousands of files, and performance begins to degrade with this number of files because of contention at the metadata server. Often a post-processing step is necessary to refactor file-per-processor data into a format that is readable by the analysis tool. In contrast, H5hut files can be directly loaded in parallel by visualization tools like VisIt and ParaView. H5hut is a veneer API for HDF5: H5hut files are also valid HDF5 files and are compatible with other HDF5-based interfaces and tools. For example, the h5dump tool that comes standard with HDF5 can export H5hut files to ASCII or XML for additional portability. H5hut also includes tools to convert H5hut data to the Visualization ToolKit (VTK) format and to generate scripts for the GNUplot data plotting tool.
An unstructured, high-order, parallel Discontinuous Galerkin (DG) code that I am developing as part of my PhD project. hedge's design is focused on two things: being fast and easy to use. While the need for speed dictates implementation in a low-level language, these same low-level languages become quite cumbersome at a higher level of abstraction. This is where the "h" in hedge comes from: it takes a hybrid approach. While a small core is written in C++ for speed, all user-visible functionality is driven from Python.
A C++ library for rapid development of adaptive hp-FEM / hp-DG solvers. Novel hp-adaptivity algorithms help solve a large variety of problems ranging from ODE and stationary linear PDE to complex time-dependent nonlinear multiphysics PDE systems.
A multi-purpose finite element software providing powerful tools for efficient and accurate solution of a wide range of problems modeled by partial differential equations (PDEs). Based on object-oriented concepts and the full capabilities of C++ the HiFlow³ project follows a modular and generic approach for building efficient parallel numerical solvers. It provides highly capable modules dealing with the mesh setup, finite element spaces, degrees of freedom, linear algebra routines, numerical solvers, and output data for visualization. Parallelism – as the basis for high performance simulations on modern computing systems – is introduced on two levels: coarse-grained parallelism by means of distributed grids and distributed data structures, and fine-grained parallelism by means of platform-optimized linear algebra back-ends.
A C++ software package for the discontinuous Galerkin method. This framework is intended for those who want to easily develop and apply discontinuous Galerkin methods for various physical problems, especially partial differential equations arising from fluid mechanics and electromagnetism. Using HPGEM, one can numerically solve everything from the simplest classroom examples, such as the linear advection and Burgers equations, to the most complicated practical examples, such as the shallow water, Euler, Navier-Stokes and Maxwell equations.
A general purpose C++ runtime system for parallel and distributed applications of any scale. The HPX runtime software package is a modular, feature-complete, and performance oriented representation of the ParalleX execution model targeted at conventional parallel computing architectures such as SMP nodes and commodity clusters. HPX is a C++ library that supports a set of critical mechanisms for dynamic adaptive resource management and lightweight task scheduling within the context of a global address space.
A comprehensive navigational query language for relational databases. HTSQL is designed for data analysts and other accidental programmers who have complex business inquiries to solve and need a productive tool to write and share database queries.
A software platform for creating dynamic web sites that support scientific research and educational activities.
The Ibis Portability layer (IPL) is a communication library specifically designed for usage in a grid environment. It has a number of properties which help to achieve its goal of providing programmers with an easy to use, reliable grid communication infrastructure.
A Java-based software framework for analyzing and visualizing geoscience data. The IDV "reference application" is a geoscience display and analysis software system with many of the standard data displays that other Unidata software (e.g. GEMPAK and McIDAS) provide. It brings together the ability to display and work with satellite imagery, gridded data (for example, numerical weather prediction model output), surface observations, balloon soundings, NWS WSR-88D Level II and Level III RADAR data, and NOAA National Profiler Network data, all within a unified interface. It also provides 3-D views of the earth system and allows users to interactively slice, dice, and probe the data, creating cross-sections, profiles, animations and value read-outs of multi-dimensional data sets. The IDV can display any Earth-located data if it is provided in a known format.
A free, open source, visualization and data analysis software package that is the fifth generation in SSEC’s 40 year history of sophisticated McIDAS (Man computer Interactive Data Access System) software packages. McIDAS-V displays weather satellite (including hyperspectral) and other geophysical data in 2- and 3-dimensions, and can be used to analyze and manipulate the data with its powerful mathematical functions.
A Python library for defining domain specific languages and generating high performance code.
A high performance math library for programmers and scientists. Extending the .NET framework with tools needed for scientific computing, it simplifies the implementation of all kinds of numerical algorithms in convenient, familiar C#-syntax – optimized to the speed of C and FORTRAN.
A public domain Java image processing program. It can display, edit, analyze, process, save and print 8-bit, 16-bit and 32-bit images. It can read many image formats including TIFF, GIF, JPEG, BMP, DICOM, FITS and "raw". It supports "stacks", a series of images that share a single window. It is multithreaded, so time-consuming operations such as image file reading can be performed in parallel with other operations.
ImageJ was designed with an open architecture that provides extensibility via Java plugins. Custom acquisition, analysis and processing plugins can be developed using ImageJ’s built in editor and Java compiler. User-written plugins make it possible to solve almost any image processing or analysis problem.
A distribution of ImageJ (and soon ImageJ2) together with Java, Java 3D and a lot of plugins organized into a coherent menu structure. Fiji compares to ImageJ as Ubuntu compares to Linux.
A project to combine VTK with ImageJ.
A project to develop the next-generation version of ImageJ.
The motivation for developing ImageTools (formerly known as Im2Learn) comes from academic, government and industrial collaborations that involve development of new computer methods and solutions for understanding complex data sets. Images and other types of data generated by various instruments and sensors form complex and highly heterogeneous data sets, and pose challenges for knowledge extraction.
The main goal of the ImageTools research and development is to automate information processing of repetitive, laborious and tedious analysis tasks and build user-friendly decision-making systems that operate in automated or semi-automated mode in a variety of applications. The development is based on theoretical foundations of image and video processing, computer vision, data fusion, statistical and spectral modeling.
A powerful numerical tool for academic research. It can combine the versatility of industrial codes with the accuracy of spectral codes. Thanks to a very successful project with NAG and HECToR (a UK supercomputing facility), Incompact3d can be used on up to hundreds of thousands of computational cores to solve the incompressible Navier-Stokes equations. This high level of parallelisation is achieved thanks to a highly scalable 2D decomposition library and a distributed Fast Fourier Transform (FFT) interface.
A distributed operating system, originally developed at Bell Labs, but now developed and maintained by Vita Nuova® as Free Software. Applications written in Inferno’s concurrent programming language, Limbo, are compiled to its portable virtual machine code (Dis), to run anywhere on a network in the portable environment that Inferno provides. Unusually, that environment looks and acts like a complete operating system.
The use of a high-level language and virtual machine is sensible but mundane. The interesting thing is the system's representation of services and resources. They are represented in a file-like name hierarchy. Programs access them using only the file operations open, read/write, and close. The files may of course represent stored data, but may also be devices, network and protocol interfaces, dynamic data sources, and services. The approach unifies and provides basic naming, structuring, and access control mechanisms for all system resources. A single file-service protocol (the same as Plan 9's 9P) makes all those resources available for import or export throughout the network in a uniform way, independent of location. An application simply attaches the resources it needs to its own per-process name hierarchy (name space).
The system can be used to build portable client and server applications. It makes it straightforward to build lean applications that share all manner of resources over a network, without the cruft of much of the Grid software one sees.
The inspyred library grew out of insights from Ken de Jong's book "Evolutionary Computation: A Unified Approach." The goal of the library is to separate problem-specific computation from algorithm-specific computation. Any bio-inspired algorithm has at least two aspects that are entirely problem-specific: what solutions to the problem look like and how such solutions are evaluated. These components will certainly change from problem to problem. For instance, a problem dealing with optimizing the volume of a box might represent solutions as a three-element list of real values for the length, width, and height, respectively. In contrast, a problem dealing with optimizing a set of rules for escaping a maze might represent solutions as a list of pairs of elements, where each pair contains the two-dimensional neighborhood and the action to take in such a case.
On the other hand, there are algorithm-specific components that may make no (or only modest) assumptions about the type of solutions upon which they operate. These components include the mechanism by which parents are selected, the way offspring are generated, and the way individuals are replaced in succeeding generations. For example, the ever-popular tournament selection scheme makes no assumptions whatsoever about the type of solutions it is selecting. The n-point crossover operator, on the other hand, does make an assumption that the solutions will be linear lists that can be “sliced up,” but it makes no assumptions about the contents of such lists. They could be lists of numbers, strings, other lists, or something even more exotic.
The central design principle for inspyred is to separate problem-specific components from algorithm-specific components in a clean way so as to make algorithms as general as possible across a range of different problems.
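A minimal sketch of that separation in plain Python (not inspyred's actual API): the selection and crossover operators below make no assumptions about the box-volume problem they happen to be applied to.

```python
import random

# --- problem-specific components (the box-volume example above) ---
def generate_box(rng):
    # a candidate is [length, width, height], each in [0, 1)
    return [rng.random() for _ in range(3)]

def evaluate_box(candidate):
    l, w, h = candidate
    return l * w * h                       # fitness: volume of the box

# --- algorithm-specific components: agnostic to candidate contents ---
def tournament_select(population, fitnesses, rng, k=2):
    # pick k individuals at random, keep the fittest; no assumptions at all
    picks = rng.sample(range(len(population)), k)
    return population[max(picks, key=lambda i: fitnesses[i])]

def one_point_crossover(a, b, rng):
    # assumes only that candidates are linear lists that can be sliced
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

rng = random.Random(42)
pop = [generate_box(rng) for _ in range(20)]
for _ in range(50):
    fits = [evaluate_box(c) for c in pop]
    pop = [one_point_crossover(tournament_select(pop, fits, rng),
                               tournament_select(pop, fits, rng), rng)
           for _ in range(len(pop))]

best = max(pop, key=evaluate_box)
```

Swapping in the maze-rules problem would mean replacing only the two problem-specific functions; the operators stay untouched.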
Invenio is a free software suite enabling you to run your own digital library or document repository on the web. The technology offered by the software covers all aspects of digital library management from document ingestion through classification, indexing, and curation to dissemination. Invenio complies with standards such as the Open Archives Initiative metadata harvesting protocol (OAI-PMH) and uses MARC 21 as its underlying bibliographic format. The flexibility and performance of Invenio make it a comprehensive solution for management of document repositories of moderate to large sizes (several millions of records).
A scalable, unified high-end computing I/O forwarding software layer.
An integration middleware for the Internet of Things. It provides a communication stack for embedded devices based on IPv6, Web services and oBIX to provide interoperable interfaces for smart objects. Using 6LoWPAN for constrained wireless networks, and the Constrained Application Protocol together with Efficient XML Interchange, it offers an efficient stack that allows interoperable Web technologies to be used in sensor and actuator networks while remaining nearly as efficient, in terms of transmission message sizes, as existing automation systems. The IoTSyS middleware aims to provide a gateway concept for the sensor and actuator systems found in today's home and building automation systems, along with a stack that can be deployed directly on embedded 6LoWPAN devices; it further addresses security, discovery and scalability issues.
A Gallery of Interesting Python Notebooks
Stores IPython notebooks automagically onto OpenStack clouds through Swift.
A package for running R code within IPython.
A Python library for meteorology and climatology.
Reference Guide: http://scitools.org.uk/iris/docs/latest/iris/iris.html
The integrated Rule-Oriented Data-management System, a community-driven, open source, data grid software solution. It helps researchers, archivists and others manage (organize, share, protect, and preserve) large sets of computer files. Collections can range in size from moderate to a hundred million files or more totaling petabytes of data. This is the open-source successor to SRB.
A global geometric framework for nonlinear dimensionality reduction. A Matlab package is available.
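For illustration, a compact pure-NumPy sketch of the Isomap algorithm itself (k-nearest-neighbor graph, geodesic distances via Floyd-Warshall, then classical MDS); the Matlab package linked above is the reference implementation, and this sketch omits its refinements:

```python
import numpy as np

def isomap(X, n_neighbors=6, n_components=2):
    n = len(X)
    # pairwise Euclidean distances
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # build a symmetrized k-nearest-neighbor graph
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        idx = np.argsort(D[i])[1:n_neighbors + 1]
        G[i, idx] = D[i, idx]
        G[idx, i] = D[i, idx]
    # geodesic (shortest-path) distances via Floyd-Warshall
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    # classical MDS on the geodesic distance matrix
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * H @ (G ** 2) @ H
    w, v = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]   # largest eigenvalues first
    return v[:, order] * np.sqrt(np.maximum(w[order], 0.0))
```

The Floyd-Warshall step is O(n^3), so real implementations use Dijkstra on the sparse graph; the output preserves geodesic, rather than straight-line, distances on the manifold.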
A cross-platform computational fluid dynamics (CFD) library for mesh-free, particle-based simulation and visualization of incompressible flows using Smoothed Particle Hydrodynamics (SPH) methods. The library is open source and cross-platform, written in pure C++ and OpenCL, the new standard for parallel programming of modern processors. The library makes full use of GPUs, CPUs and other OpenCL-enabled devices in the running system to accelerate the computing.
A Tile Assembly Model simulator that allows users to design tilesets and seeds and to simulate assemblies. The simulator allows for graphical creation of seed assemblies, fast forwarding and rewinding of assembly growth, and easy zooming, scrolling, and inspection of assemblies among other features. The graphical tile type editor allows tile types to be easily designed and manipulated. Assemblies and tile sets can be created, saved, and reloaded.
Our research is motivated by the prospect, raised by pioneering work of Seeman, Winfree, and Rothemund, of engineering structures that autonomously assemble themselves from molecular components. We are primarily interested in understanding the power and limitations of this "programming of matter". Our work includes the development and analysis of mathematical models of self-assembly, the creation and use of software environments for developing and simulating self-assembly systems, and studies of the self-assembly of fractals and other complex structures. We also work to adapt methods that software engineers have developed for creating, controlling, and reasoning about systems of immense complexity (requirements engineering, programming languages, formal verification, software safety, …) to the even greater challenges that nanotechnology will confront.
A C++ library of mathematical, signal processing and communication classes and functions. Its main use is in simulation of communication systems and for performing research in the area of communications. The kernel of the library consists of generic vector and matrix classes, and a set of accompanying routines. Such a kernel makes IT++ similar to MATLAB, GNU Octave or SciPy.
Technologies that enable application scientists to easily use multiple mesh and discretization strategies within a single simulation on petascale computers.
Python bindings for ITAPS interfaces.
A code library which provides geometry functionality used for mesh generation and other applications. This functionality includes that commonly found in solid modeling engines, like geometry creation, query and modification; CGMA also includes capabilities not commonly found in solid modeling engines, like geometry decomposition tools and support for shared material interfaces.
A library of mesh generation functionality.
A component for representing and evaluating mesh data. MOAB implements the ITAPS iMesh interface; iMesh is a common interface to mesh data implemented by several different packages, including MOAB. Various tools like smoothing, adaptive mesh refinement, and parallel mesh communication are implemented on top of iMesh.
J is a modern, high-level, general-purpose, high-performance programming language. J is particularly strong in the mathematical, statistical, and logical analysis of data. It is a powerful tool in building new and better solutions to old problems and even better at finding solutions where the problem is not already well understood.
JHOVE2 is a framework and application for next-generation format-aware characterization of digital objects. The function of JHOVE2 is encapsulated in a series of modules that can be configured for use within the framework's plug-in architecture. The NetCDF Format module, denominated JANEME (J-NetCDF Metadata Extractor), provides characterization services for the netCDF family of formats, consisting of the netCDF-3 and netCDF-4 profiles, as well as for the GRIB family (GRIB 1.0 and 2.0).
JANEME is able to parse and characterize files in NetCDF and GRIB format via the Unidata netcdf-java library 4.1 (Unidata NetCDF-Java), and to fill out templates conforming to Dublin Core and a C3Grid ISO 19115-compatible profile with the extracted metadata, while also supporting JHOVE's standard output. Additionally, it supplies an Axis2 web service deployable on any Java application server, e.g., Tomcat.
A set of Matlab functions for the purpose of analyzing data. It consists of four hundred m-files spanning thirty-five thousand lines of code. JLAB includes functions ranging in complexity from one-line aliases to high-level algorithms for certain specialized tasks. About four hundred automated tests and dozens of scripts for sample figures help keep things organized.
Jarray - Vector, matrix, and N-D array tools. Jmath - Mathematical aliases and basic functions. Jpoly - Special polynomials, matrices, and functions. Jgraph - Fine-tuning and customizing figures. Jstrings - Strings, files, and variables. Jstats - Statistical tools and probability distributions. Jsignal - Signal processing, wavelet and spectral analysis. Jellipse - Elliptical (bivariate) time series analysis. Jcell - Tools for operating on cell arrays of numerical arrays. Vtools - Operations on multiple data arrays simultaneously.
A high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, largely written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.
The Kepler Project is dedicated to furthering and supporting the capabilities, use, and awareness of the free and open source, scientific workflow application, Kepler. Kepler is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet, and is an effective environment for integrating disparate software components, such as merging "R" scripts with compiled "C" code, or facilitating remote, distributed execution of models. Using Kepler’s graphical user interface, users simply select and then connect pertinent analytical components and data sources to create a "scientific workflow"—an executable representation of the steps required to generate results. The Kepler software helps users share and reuse data, workflows, and components developed by the scientific community to address common needs.
An auto-parallelizing Fortran/C compiler for NVIDIA GPUs.
KGPU is a GPU computing framework for the Linux kernel. It allows the Linux kernel to call CUDA programs running on GPUs directly. The motivation is to augment operating systems with GPUs so that not only userspace applications but also the operating system itself can benefit from GPU acceleration. It can also free the CPU from some computation-intensive work by enabling the GPU as an extra computing device.
A user-friendly graphical workbench for the entire analysis process: data access, data transformation, initial investigation, powerful predictive analytics, visualisation and reporting. The open integration platform provides over 1000 modules (nodes), including those of the KNIME community and its extensive partner network.
KNIME can be downloaded onto the desktop and used free of charge. KNIME products include additional functionalities such as shared repositories, authentication, remote execution, scheduling, SOA integration and a web user interface as well as world-class support. Robust big data extensions are available for distributed frameworks such as Hadoop.
An open-source framework for the implementation of numerical methods for the solution of engineering problems. It is written in C++ and is designed to allow collaborative development by large teams of researchers, focusing on modularity as well as on performance. Kratos features a "core" and "applications" approach, where "standard tools" (databases, linear algebra, search structures, etc.) come as part of the core and are available as building blocks in the development of "applications", which focus on the solution of the problems of interest. Its ultimate goal is to simplify the development of new numerical methods.
An extensible XSLT-based framework for extracting RDF from XML, supporting multiple input languages as well as multiple output RDF notations.
Creates PNG images of mathematical expressions formatted in LaTeX. While it can convert a whole LaTeX document, it is designed to easily generate images from just a fragment of LaTeX code. It depends on other software: latex, dvips, and convert. (The last one is from the ImageMagick graphics toolset.) If you already work with LaTeX on a modern Unix or Linux system, you probably already have all of that installed.
A tool to calculate backward-in-time, finite-size Lyapunov exponents (FSLEs) of the global oceans.
A signal-processing-oriented command language with Matlab-like syntax, which includes a high-level object-oriented graphic language. It can handle high-level structures such as signals, images, wavelet transforms, extrema representations, short-time Fourier transforms, etc.
A comprehensive LaTeX to XML converter. Programs to convert LaTeX into XML, and the resulting XML into HTML or XHTML.
The Local Ensemble Transform Kalman Filter is an advanced data assimilation method for many possible applications.
Much computational science deals with the approximate solution of models described by systems of partial differential equations; these are used across the entire breadth of the quantitative sciences. Such a model takes as input the physical state at some initial time and runs forward in time to compute the state at some later time of interest; that is, it maps cause to effect, and so it is referred to as the forward model. For a given forward model, one can associate an adjoint model, which does the opposite: it maps from effect back to cause, and so runs backwards in time. Once an adjoint model is available, it makes possible a number of very powerful techniques: optimising engineering designs, assimilating data from physical measurements, estimating unknown parameters in the forward model, and estimating the approximation error in quantities of interest. Such applications are of huge interest and importance across all of engineering and the quantitative sciences. As computational science moves from mere simulation to optimisation, adjoint modelling will only grow in importance.
The fundamental abstraction of algorithmic differentiation is that it treats the model as a sequence of primitive instructions, each of which may be differentiated in turn and composed using the chain rule. libadjoint explores a similar, but higher-level abstraction: that the model is a sequence of linear solves. In this approach, the model is instrumented with library calls that record what operators it is assembling and what they are being applied to, in an analogous manner to building a tape for reverse-mode AD. The model developer then provides callback routines that compute the action of or assemble these operators to the library. With this information, the library may then assemble the adjoint of each equation solved in the forward model automatically. This promises to make adjointing models significantly easier than it currently is.
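The power of the adjoint approach can be illustrated on the simplest possible "model", a single linear solve (a minimal numpy sketch of the mathematics, not libadjoint's actual API):

```python
import numpy as np

# Forward model: a single linear solve, A u = f.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
f = np.array([1.0, 2.0])
u = np.linalg.solve(A, f)

# Quantity of interest: J(u) = g . u
g = np.array([1.0, -1.0])
J = g @ u

# Adjoint solve: A^T lam = g.  Then dJ/df = lam: the sensitivity of J
# to *every* component of the input comes from one extra linear solve,
# independent of the number of inputs.
lam = np.linalg.solve(A.T, g)

# Finite-difference check of the first sensitivity.
eps = 1e-6
f_pert = f.copy()
f_pert[0] += eps
fd = (g @ np.linalg.solve(A, f_pert) - J) / eps
```

For a forward model that is a sequence of such solves, the same pattern repeats backwards through the sequence, which is exactly the structure libadjoint exploits.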
A library for state-space modelling and Bayesian inference on high-performance computer hardware, including multi-core CPUs, many-core GPUs (graphics processing units) and distributed-memory clusters. The staple methods of LibBi are based on sequential Monte Carlo (SMC), also known as particle filtering. These methods include particle Markov chain Monte Carlo (PMCMC) and SMC2. Other methods include the extended Kalman filter and some parameter optimisation routines. LibBi consists of a C++ template library, as well as a parser and compiler, written in Perl, for its own modelling language.
A C/C++ library for reading and writing the very common LAS LiDAR format. The ASPRS LAS format is a sequential binary format used to store data from LiDAR sensors and by LiDAR processing software for data interchange and archival.
A LAS reader plugin for ParaView.
An auto-parallelizing library to speed up your stencil code based computer simulations. It runs on virtually all current architectures, be it multi-cores, GPUs, or large scale MPI clusters.
A project to provide an implementation of the GDAL specification within the HDF5 file format. Specifically, the format will support raster attribute tables (commonly not included within other formats), image pyramids, GDAL metadata, and in-built statistics, while also providing large-file handling with compression used throughout the file. Being based on the HDF5 standard, it will also provide a base from which other formats could be derived and will be a good choice for long-term data archiving. An independent software library (libKEA) gives complete access to the KEA image format, alongside a GDAL driver allowing KEA images to be used through any GDAL-supported software.
The libMesh library provides a framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms. A major goal of the library is to provide support for adaptive mesh refinement (AMR) computations in parallel while allowing a research scientist to focus on the physics they are modeling.
libMesh currently supports 1D, 2D, and 3D steady and transient simulations on a variety of popular geometric and finite element types. The library makes use of high-quality, existing software whenever possible. PETSc or the Trilinos Project are used for the solution of linear systems on both serial and parallel platforms, and LASPack is included with the library to provide linear solver support on serial machines. An optional interface to SLEPc is also provided for solving both standard and generalized eigenvalue problems.
A C++ parallel framework for multiscale coupling methods dedicated to material simulations. It takes the form of a library providing an API with which coupled simulations can be programmed. At present, the stable implemented coupling method is based on the Bridging Method. The coupled parts can be provided by existing projects: the API offers C++ templated interfaces to minimize the cost of integration, taking the form of plugins or the like. Several codes have been integrated to provide a functional prototype of the framework: the molecular dynamics packages Stamp (a code of the CEA) and Lammps (Sandia laboratories), and, for continuum mechanics discretized by finite elements, a code based on the libMesh framework.
A C++ library computing a principal component analysis plus corresponding transformations. This requires the Armadillo library.
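For readers outside C++, the computation this library encapsulates (diagonalize the covariance matrix, then transform the data into component scores) can be sketched in a few lines of numpy; this is a generic PCA illustration, not libpca's API:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: three variables with very different variances.
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.1])

# Center, then diagonalize the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # sort by descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The "corresponding transformation": project data onto the components.
scores = Xc @ eigvecs
```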
Library for spherical harmonic transforms. A collection of algorithms for efficient conversion between maps on the sphere and their spherical harmonic coefficients. It supports a wide range of pixelisations (including HEALPix, GLESP, and ECP).
A software tool for creating multiphysics simulation codes.
A DSL for solving mesh-based PDEs.
A platform for development and distribution of scientific software.
The Large Time/Frequency Analysis Toolbox (LTFAT) is a Matlab/Octave toolbox for working with time-frequency analysis and synthesis. It is intended both as an educational and a computational tool. The toolbox provides a large number of linear transforms, including Gabor and wavelet transforms, along with routines for constructing windows (filter prototypes) and routines for manipulating coefficients.
An extended version of pdfTeX using Lua as an embedded scripting language. The LuaTeX project’s main objective is to provide an open and configurable variant of TeX while at the same time offering downward compatibility. LuaTeX uses Unicode (as UTF-8) as its default input encoding, and is able to use modern (OpenType) fonts (for both text and mathematics).
A Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
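The core idea, that tasks declare their dependencies and the framework resolves an execution order, can be sketched in plain Python. This is a simplified model of dependency resolution only; Luigi's real API uses `luigi.Task` subclasses with `requires()`, `output()` and `run()` methods:

```python
# Minimal model of dependency-resolved batch jobs (a sketch of the idea,
# not Luigi's API). Each task maps to (dependencies, action).
def run_pipeline(tasks, target):
    done, order = set(), []

    def visit(name):
        if name in done:
            return
        for dep in tasks[name][0]:   # run all dependencies first
            visit(dep)
        tasks[name][1]()             # then run the task itself
        done.add(name)
        order.append(name)

    visit(target)
    return order

log = []
tasks = {
    "fetch":  ([],                 lambda: log.append("fetch")),
    "clean":  (["fetch"],          lambda: log.append("clean")),
    "train":  (["clean"],          lambda: log.append("train")),
    "report": (["train", "clean"], lambda: log.append("report")),
}
order = run_pipeline(tasks, "report")
# Each task runs exactly once, after everything it depends on.
```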
Locally Weighted Projection Regression (LWPR) is a recent algorithm that achieves nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it uses locally linear models, spanned by a small number of univariate regressions in selected directions in input space. A locally weighted variant of Partial Least Squares (PLS) is employed for doing the dimensionality reduction.
A Python version is available.
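The "locally weighted linear model" idea at the heart of LWPR can be illustrated with a plain locally weighted least-squares fit (a toy 1-D sketch only; real LWPR additionally learns its local models incrementally and uses partial least squares to cope with high-dimensional, redundant inputs):

```python
import numpy as np

def loclin_predict(xq, X, y, bandwidth=0.2):
    """Locally weighted linear prediction at query point xq.
    Gaussian weights emphasize training points near xq, and a
    weighted least-squares line is fitted through them."""
    w = np.exp(-0.5 * ((X - xq) / bandwidth) ** 2)
    A = np.column_stack([np.ones_like(X), X])
    W = np.diag(w)
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[0] + beta[1] * xq

# Nonlinear function approximated by patching together local lines.
X = np.linspace(0, 2 * np.pi, 200)
y = np.sin(X)
pred = loclin_predict(1.0, X, y)
```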
An open-source software package for multidimensional data analysis and reproducible computational experiments.
The latest generation of the ECMWF’s meteorological plotting software MAGICS. Although completely redesigned in C++, it is intended to be as backwards-compatible as possible with the Fortran interface. The contour package was rewritten and no longer depends on the CONICON licence. Besides its programming interfaces (Fortran and C), Magics++ offers MagML, a plot description language based on XML. Magics++ supports the plotting of contours, wind fields, observations, satellite images, symbols, text, axes and graphs (including boxplots). Data fields to be plotted may be presented in various formats, for instance GRIB 1 and 2 code data, gaussian grid, regularly spaced grid and fitted data. GRIB data is handled via ECMWF’s GRIB API software. Input data can also be in BUFR and NetCDF format or retrieved from an ODB database. The produced meteorological plots can be saved in various formats, such as PostScript, EPS, PDF, GIF, PNG and SVG.
Matrix algebra on GPU and multicore architectures. The MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current Multicore+GPU systems.
MASA (Manufactured Analytical Solution Abstraction) is a library written in C++ (with C and Fortran90 interfaces) which provides a suite of manufactured solutions for the software verification of partial differential equation solvers in multiple dimensions. MASA provides two methods to import manufactured solutions into the library: users can either generate their own source terms, or they can use the automatic differentiation capabilities provided in MASA. New solutions are added to the library via the "MASA-import" script.
A free software library written to perform vectorized scientific computing and to be as compatible as possible with both GNU Octave and Matlab computing frameworks, offering general purpose, portable and freely available features for the scientific community. Mastrave is mostly oriented to ease complex modelling tasks such as those typically needed within environmental models, even when involving irregular and heterogeneous data series.
Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell, web application servers, and six graphical user interface toolkits.
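A minimal script-mode example of producing a hardcopy figure (using the non-interactive Agg backend, so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")            # render off-screen, suitable for scripts
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("amplitude")
ax.legend()
fig.savefig("sine.png", dpi=150)  # .pdf, .svg, .eps also work
```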
A library of high-level functions that facilitate making informative and attractive plots of statistical data using matplotlib. It also provides concise control over the aesthetics of the plots, improving on matplotlib’s default look.
An on-the-fly calculator for Monte Carlo methods that uses latin-hypercube sampling (see soerp for the Python implementation of the analytical second-order error propagation original Fortran code SOERP by N. D. Cox) to perform non-order-specific error propagation (or uncertainty analysis). The mcerp package allows you to easily and transparently track the effects of uncertainty through mathematical calculations. Advanced mathematical functions, similar to those in the standard math module, can also be evaluated directly.
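The underlying idea, pushing latin-hypercube samples of the uncertain inputs through an ordinary calculation, can be sketched with numpy and scipy (a hand-rolled illustration, not mcerp's operator-overloading API):

```python
import numpy as np
from scipy.stats import norm

def lhs(n, d, rng):
    """Latin-hypercube sample: each of the d dimensions is stratified
    into n equal-probability bins, with one point per bin, then the
    bins are paired randomly across dimensions."""
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    for j in range(d):
        u[:, j] = u[rng.permutation(n), j]
    return u

rng = np.random.default_rng(42)
u = lhs(100_000, 2, rng)

# Uncertain inputs via the inverse CDF: x ~ N(10, 1), y ~ N(5, 0.5).
x = norm.ppf(u[:, 0], loc=10.0, scale=1.0)
y = norm.ppf(u[:, 1], loc=5.0, scale=0.5)

# Uncertainty is tracked simply by doing ordinary arithmetic on the
# sample arrays; the distribution of the result falls out directly.
z = x * y + np.sqrt(x)
z_mean, z_std = z.mean(), z.std()
```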
The MATLAB Compiler Runtime (MCR) is a standalone set of shared libraries that enables the execution of compiled MATLAB applications or components on computers that do not have MATLAB installed. When used together, MATLAB, MATLAB Compiler, and the MCR enable you to create and distribute numerical applications or software components quickly and securely.
The Modular toolkit for Data Processing is a collection of supervised and unsupervised learning algorithms and other data processing units that can be combined into data processing sequences and more complex feed-forward network architectures. The base of available algorithms is steadily increasing and includes signal processing methods (Principal Component Analysis, Independent Component Analysis, Slow Feature Analysis), manifold learning methods ([Hessian] Locally Linear Embedding), several classifiers, probabilistic methods (Factor Analysis, RBM), data pre-processing methods, and many others.
A set of software tools for data acquisition and storage and a methodology for management of complex scientific data. MDSplus allows all data from an experiment or simulation code to be stored into a single, self-descriptive, hierarchical structure. The system was designed to enable users to easily construct complete and coherent data sets. The MDSplus programming interface contains only a few basic commands, simplifying data access even into complex structures. Using the client/server model, data at remote sites can be read or written without file transfers. MDSplus includes X Windows and Java tools for viewing data or for modifying or viewing the underlying structures.
A free software media publishing platform that anyone can run. You can think of it as a decentralized alternative to Flickr, YouTube, SoundCloud, etc.
A software extension for MediaWiki that makes it into a powerful environment for collaborating on publication-quality manuscripts and software projects.
Meshing to Realistic Domains
Software for generating properly closed 2d ocean domains for both global and regional simulations.
An open source, portable, and extensible system for the processing and editing of unstructured 3D triangular meshes. The system is aimed at helping the processing of the typical not-so-small unstructured models arising in 3D scanning, providing a set of tools for editing, cleaning, healing, inspecting, rendering and converting such meshes.
The Metadata Gathering, Extraction and Transformation Application is a Python application for discovering and extracting metadata from spatial raster datasets (metadata crawler) and transforming it into XML (metadata transformation). A number of generic and specialised imagery formats are supported. The format support has a plugin architecture and more formats can easily be added.
A meteorological workstation application designed to be a complete working environment for both the operational and research meteorologist. Its capabilities include powerful data access, processing and visualisation. It features a powerful icon-based user interface for interactive work, and a scripting language for batch processing. The two are linked through the ability to automatically convert icons into their equivalent script code.
A domain specific language (DSL) devoted to the simulation of biological processes, especially those whose state space must be computed jointly with the current state of the system. MGS embeds the idea of topological collections and their transformations into the framework of a simple dynamically typed functional language. Collections are just new kinds of values and transformations are functions acting on collections and defined by a specific syntax using rules. MGS is an applicative programming language: operators acting on values combine values to give new values, they do not act by side-effect.
An open source toolkit for students and researchers in power systems. It is designed to make working with economic dispatch (ED), optimal power flow (OPF), and unit commitment (UC) problems simple and intuitive. The goal is to foster collaboration with other researchers and to make learning easier for students.
A unikernel for constructing secure, high-performance network applications across a variety of cloud computing and mobile platforms. Code can be developed on a normal OS such as Linux or MacOS X, and then compiled into a fully-standalone, specialised microkernel that runs under the Xen hypervisor. Since Xen powers most public cloud computing infrastructure such as Amazon EC2, this lets your servers run more cheaply and more securely, and with finer control, than with a full software stack.
Mirage is based around the OCaml language, with syntax extensions and libraries which provide networking, storage and concurrency support that are easy to use during development, and map directly into operating system constructs when being compiled for production deployment. The framework is fully event-driven, with no support for preemptive threading.
Implementing and consuming Machine Learning techniques at scale are difficult tasks for ML Developers and End Users. MLbase is a platform addressing the issues of both groups, and consists of three components: MLlib, MLI, ML Optimizer.
A scalable C++ machine learning library with Python bindings.
A Python module for machine learning. It provides a wide range of state-of-the-art machine learning methods for supervised and unsupervised problems and it is aimed at finding a reasonable compromise among modularity, maintainability, reproducibility, usability and efficiency.
A parallelized Python library for finding modal decompositions and reduced-order models.
A general purpose statistical data-visualization system. It features outstanding interactive visualization techniques for data of almost any kind.
The goal of the Matrices Over Runtime Systems at Exascale (MORSE) project is to design dense and sparse linear algebra methods that achieve the fastest possible time to an accurate solution on large-scale multicore systems with GPU accelerators, using all the processing power that future high end systems can make available. To develop software that will perform well on petascale and exascale systems with thousands of nodes and millions of cores, several daunting challenges have to be overcome, both by the numerical linear algebra and the runtime system communities. By designing a research framework for describing linear algebra algorithms at a high level of abstraction, the MORSE team will enable the strong collaboration between research groups in linear algebra and runtime systems needed to develop methods and libraries that fully benefit from the potential of future large-scale machines. Our project will take a pioneering step in the effort to bridge the immense software gap that has opened up in front of the High-Performance Computing (HPC) community.
MOdular library for raSter bAsed hydrologIcal appliCatiOn.
The Model for Prediction Across Scales (MPAS) is a collaborative project for developing atmosphere, ocean and other earth-system simulation components for use in climate, regional climate and weather studies.
The defining features of MPAS are the unstructured Voronoi meshes and C-grid discretization used as the basis of the model components. The unstructured Voronoi meshes, formally Spherical Centroidal Voronoi Tessellations (SCVTs), allow for both quasi-uniform discretization of the sphere and local refinement. The C-grid discretization, where the normal component of velocity on cell edges is prognosed, is especially well-suited for higher-resolution, mesoscale atmosphere and ocean simulations.
The MIT Multidisciplinary Simulation, Estimation, and Assimilation Systems (MSEAS) group creates, develops and utilizes new mathematical models and computational methods for ocean predictions and dynamical diagnostics, for optimization and control of autonomous ocean observation systems, and for data assimilation and data-model comparisons. Our systems are used for basic and fundamental research and for realistic simulations and predictions in varied regions of the world’s ocean.
The Manifold Toolkit provides easy mechanisms to enable arbitrary algorithms to operate on manifolds. The main application is the use of 3D rotations SO(3), as well as the construction of compound manifolds from arbitrary combinations of sub-manifolds. We also provide a refactored version of the previously released SLoM framework which implements Gauss-Newton and Levenberg-Marquardt-based sparse least-squares optimization on manifolds and a port of MTK to Matlab (MTKM).
The Mimetic Methods Toolkit is a general purpose API for computer simulation of physical phenomena based on Mimetic Discretization Methods. It allows the user to develop numerical models that satisfy physical conservation laws, while preserving even order of accuracy, up to the boundary of the considered domain. A Python wrapper is available.
A Fortran 90 Library containing different subroutines to estimate the Power Spectral Density of real time series.
Python bindings for MTSPEC.
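As a baseline for what such a library computes, here is a plain periodogram PSD estimate of a real time series in numpy (the simplest estimator; MTSPEC's multitaper methods exist precisely to reduce the variance and spectral leakage of this approach):

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 0.01                       # sampling interval in seconds
n = 4096
t = np.arange(n) * dt
# A 5 Hz sinusoid buried in a little white noise.
x = np.sin(2 * np.pi * 5.0 * t) + 0.1 * rng.normal(size=n)

# One-sided periodogram: |FFT|^2, scaled so the PSD has units of
# power per unit frequency.
X = np.fft.rfft(x - x.mean())
psd = (2.0 * dt / n) * np.abs(X) ** 2
freqs = np.fft.rfftfreq(n, d=dt)

peak = freqs[np.argmax(psd)]    # the spectral line near 5 Hz
```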
A Python extension package that provides three new objects, DateTime, DateTimeDelta and RelativeDateTime, which let you store and handle date/time values in a much more natural way than by using ticks (seconds since 1.1.1970 0:00 UTC), the representation used by Python’s time module. You can add, subtract and even multiply instances, pickle and copy them and convert the results to strings, COM dates, ticks and some other more esoteric values. In addition, there are several convenient constructors and formatters at hand to greatly simplify dealing with dates and times in real-world applications. In addition to providing an easy-to-use Python interface the package also exports a comfortable C API interface for other Python extensions to build upon.
Provides an easy to use, high-performance, reliable and robust Python interface to ODBC compatible databases such as MS SQL Server and MS Access, Oracle Database, IBM DB2 and Informix, Sybase ASE and Sybase Anywhere, MySQL, PostgreSQL, SAP MaxDB and many more. ODBC refers to Open Database Connectivity and is the industry standard API for connecting applications to databases. In order to facilitate setting up ODBC connectivity, operating systems typically provide ODBC Managers which help set up the ODBC drivers and manage the binding of the applications against these drivers. On Windows and Mac OS X the ODBC Manager is built into the system. On Unix platforms, you can choose one of the ODBC managers unixODBC, iODBC or DataDirect, which provide the same ODBC functionality on most Unix systems.
The mystic framework provides a collection of optimization algorithms and tools that allows the user to more robustly (and readily) solve optimization problems. All optimization algorithms included in mystic provide workflow at the fitting layer, not just access to the algorithms as function calls. Mystic gives the user fine-grained power to both monitor and steer optimizations as the fit processes are running.
Where possible, mystic optimizers share a common interface, and thus can be easily swapped without the user having to write any new code. Mystic solvers all conform to a solver API, thus also have common method calls to configure and launch an optimization job. For more details, see mystic.abstract_solver. The API also makes it easy to bind a favorite 3rd party solver into the mystic framework.
By providing a robust interface designed to allow the user to easily configure and control solvers, mystic reduces the barrier to implementing a target fitting problem as stable code. Thus the user can focus on building their physical models, and not spend time hacking together an interface to optimization code.
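The design point here, solvers sharing one interface so they can be swapped without new code, can be sketched generically (hypothetical classes for illustration only; mystic's real solver API is documented in mystic.abstract_solver):

```python
import numpy as np

class RandomSearchSolver:
    """Toy solver conforming to a shared SetObjective/Solve interface.
    A sketch of the design idea, not mystic's actual classes."""
    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.history = []                 # a monitor could hook in here
    def SetObjective(self, f):
        self.f = f
    def Solve(self, x0, iters=2000):
        best = np.asarray(x0, float)
        fbest = self.f(best)
        for _ in range(iters):
            cand = best + 0.1 * self.rng.normal(size=best.shape)
            fc = self.f(cand)
            if fc < fbest:
                best, fbest = cand, fc
            self.history.append(fbest)
        return best

class CoordinateDescentSolver(RandomSearchSolver):
    """Same interface, different algorithm."""
    def Solve(self, x0, iters=100, step=0.05):
        best = np.asarray(x0, float)
        for _ in range(iters):
            for i in range(best.size):
                for s in (step, -step):
                    trial = best.copy()
                    trial[i] += s
                    if self.f(trial) < self.f(best):
                        best = trial
        return best

objective = lambda x: float(np.sum((np.asarray(x) - 1.0) ** 2))

# Because both solvers conform to the same interface, swapping them
# requires no changes in the calling code.
results = []
for SolverClass in (RandomSearchSolver, CoordinateDescentSolver):
    solver = SolverClass()
    solver.SetObjective(objective)
    results.append(solver.Solve([0.0, 0.0]))
```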
A tool for extracting isosurfaces from oceanographic simulation output, such as from ROMS or HOPS. It also has the ability to compute depth-adjusted means and standard deviations, so that statistical isosurfaces (such as temperature relative to the depth-adjusted mean) may be generated.
A program for exploring longitude/latitude based data stored in NetCDF file format. Ncvtk is built on top of the VTK toolbox. Ncvtk has been designed with the aim of offering a high degree of interactivity to scientists who have a need to explore three-dimensional, time-dependent planetary data. The input data should be stored in a NetCDF file and the metadata should loosely follow the CDC convention. In particular, we support codes that are part of the Flexible Modeling System infrastructure provided the data lie on a longitude/latitude, structured grid.
A WMS (Web Map Service) for geospatial data stored in CF-compliant NetCDF files. ncWMS relies heavily on the Java NetCDF interface from Unidata. This library does a lot of the work of metadata and data extraction. In particular the GridDatatype class is frequently used to provide a high-level interface to gridded geospatial NetCDF files. The library will also read from NetCDF files on HTTP servers and from OPeNDAP servers. ncWMS has now been integrated with the THREDDS Data Server.
A comprehensive community model that predicts waves, currents, sediment transport and bathymetric change in the nearshore ocean, between the shoreline and about 10 m water depth. The model consists of a "backbone", i.e., the master program, handling data input and output as well as internal storage, together with a suite of "modules", each of which handles a focused subset of the physical processes being studied. A wave module will model wave transformation over arbitrary coastal bathymetry and predict radiation stresses and wave induced mass fluxes. A circulation module will model the slowly varying current field driven by waves, wind and buoyancy forcing, and will provide information about the bottom boundary layer structure. A seabed module will model sediment transport, determine the bedform geometry, parameterize the bedform effect on bottom friction, and compute morphological evolution resulting from spatial variations in local sediment transport rates.
A computational fluid dynamics solver based on the spectral element method.
A state-of-the-art modeling framework for oceanographic research, operational oceanography, seasonal forecasting and climate studies.
A Python package which implements various diagnostics for NEMO model output.
A robust (fully ACID) transactional property graph database. Due to its graph data model, Neo4j is highly agile and fast: for connected data operations it can be orders of magnitude faster than relational databases.
NetCDF extension for finite element grids.
A library providing high-performance I/O while still maintaining file-format compatibility with Unidata’s NetCDF.
Nonlinear multivariate and time series analysis by neural network methods.
The NFFT (nonequispaced fast Fourier transform or nonuniform fast Fourier transform) is a C subroutine library for computing the nonequispaced discrete Fourier transform (NDFT) and its generalisations in one or more dimensions, of arbitrary input size, and of complex data.
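The transform the NFFT accelerates can be written down directly. This is the O(NM) reference definition, not the fast algorithm; the library's value is computing the same sums in roughly O(N log N + M) time:

```python
import numpy as np

def ndft(x, f_hat):
    """Nonequispaced DFT: f(x_j) = sum_k f_hat[k] exp(-2*pi*i*k*x_j),
    with integer frequencies k = -N/2 .. N/2-1 and arbitrary nodes
    x_j in [-1/2, 1/2). Direct O(N*M) evaluation."""
    N = len(f_hat)
    k = np.arange(-N // 2, N // 2)
    return np.exp(-2j * np.pi * np.outer(x, k)) @ f_hat

# At equispaced nodes x_j = j/N - 1/2, the NDFT reduces to an ordinary
# DFT, which gives a convenient check against numpy's FFT.
N = 16
f_hat = np.random.default_rng(0).normal(size=N) + 0j
x = np.arange(N) / N - 0.5
f = ndft(x, f_hat)
```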
A parallel FFT software library based on MPI.
A parallel software library for the calculation of three-dimensional nonequispaced FFTs. It is available under the GPL licence. The parallelization is based on MPI. PNFFT depends on the PFFT and FFTW software libraries.
A Python interface for NFFT.
NIFTY ("Numerical Information Field Theory") is a versatile library designed to enable the development of signal inference algorithms that operate regardless of the underlying spatial grid and its resolution. Its object-oriented framework is written in Python, although it accesses libraries written in Cython, C++, and C for efficiency.
NIFTY offers a toolkit that abstracts discretized representations of continuous spaces, fields in these spaces, and operators acting on fields into classes. Thereby, the correct normalization of operations on fields is taken care of automatically without concerning the user. This allows for an abstract formulation and programming of inference algorithms, including those derived within information field theory. Thus, NIFTY permits its user to rapidly prototype algorithms in 1D, and then apply the developed code in higher-dimensional settings of real world problems. The set of spaces on which NIFTY operates comprises point sets, n-dimensional regular grids, spherical spaces, their harmonic counterparts, and product spaces constructed as combinations of those.
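The resolution-independence idea can be made concrete with a tiny sketch: if the space object carries its own volume element, field operations normalize themselves correctly at any resolution (a toy class for illustration; NIFTY's real space and field classes are far more general):

```python
import numpy as np

class RegularGridSpace:
    """Toy stand-in for a discretized 1-D space on [0, length): it
    carries the volume element, so integrals and inner products come
    out correctly normalized no matter how many pixels are used."""
    def __init__(self, npix, length=1.0):
        self.npix = npix
        self.dvol = length / npix          # volume per pixel

    def integrate(self, field_values):
        return np.sum(field_values) * self.dvol

    def dot(self, a, b):
        return np.sum(a * b) * self.dvol   # L2 inner product on the space

# The same continuous field sampled at two resolutions yields the same
# integral, because the normalization lives in the space, not the code.
for npix in (100, 10_000):
    space = RegularGridSpace(npix)
    x = (np.arange(npix) + 0.5) * space.dvol
    f = np.sin(np.pi * x)
    integral = space.integrate(f)          # -> 2/pi at any resolution
```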
Cloud computing for science.
Time-series analysis for neuroscience in Python.
Nonlinear principal component analysis (NLPCA) is commonly seen as a nonlinear generalization of standard principal component analysis (PCA). It generalizes the principal components from straight lines to curves (nonlinear). Thus, the subspace in the original data space which is described by all nonlinear components is also curved. Nonlinear PCA can be achieved by using a neural network with an autoassociative architecture, also known as an autoencoder, replicator network, bottleneck or sandglass type network. Such an autoassociative neural network is a multi-layer perceptron that performs an identity mapping, meaning that the output of the network is required to be identical to the input. However, in the middle of the network is a layer that works as a bottleneck, in which a reduction of the dimension of the data is enforced. This bottleneck layer provides the desired component values (scores).
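The bottleneck architecture can be sketched as a forward pass in numpy. This shows only the identity-mapping-with-bottleneck structure with untrained random weights; in NLPCA the weights are fitted so that the output reproduces the input:

```python
import numpy as np

rng = np.random.default_rng(0)

def autoencoder_forward(x, params):
    """Forward pass of a 3-2-1-2-3 autoassociative network: the
    1-unit bottleneck layer holds the nonlinear component score."""
    W1, W2, W3, W4 = params
    h1 = np.tanh(x @ W1)         # encoder hidden layer (3 -> 2)
    score = h1 @ W2              # bottleneck: the component value (2 -> 1)
    h2 = np.tanh(score @ W3)     # decoder hidden layer (1 -> 2)
    return score, h2 @ W4        # reconstruction of the input (2 -> 3)

params = [rng.normal(size=s) for s in [(3, 2), (2, 1), (1, 2), (2, 3)]]
x = rng.normal(size=(5, 3))                  # five 3-D data points
scores, x_hat = autoencoder_forward(x, params)
```

Training minimizes the reconstruction error between `x_hat` and `x`; the fitted `scores` then play the role of (nonlinear) principal component values.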
Nonlinear Laplacian spectrum analysis.
NOVAS is an integrated package of subroutines and functions for computing various commonly needed quantities in positional astronomy. The package can provide, in one or two subroutine or function calls, the instantaneous coordinates of any star or planet in a variety of coordinate systems. At a lower level, NOVAS also supplies astrometric utility transformations, such as those for precession, nutation, aberration, parallax, and the gravitational deflection of light. The computations are accurate to better than one milliarcsecond. The NOVAS package is an easy-to-use facility that can be incorporated into data reduction programs, telescope control systems, and simulations. The U.S. parts of The Astronomical Almanac are prepared using NOVAS. Three editions of NOVAS are available: Fortran, C, and Python.
The algorithms used by NOVAS 3.1 are based on a vector and matrix formulation that is rigorous and does not use spherical trigonometry at any point. Objects inside and outside the solar system are treated similarly. The position vectors formed and operated on by NOVAS place each object at its relevant distance (in AU) from the solar system barycenter.
Released in late 2009, NOVAS 3.0 provided greater accuracy of star and planet position calculations (apparent places) by including several small effects not implemented in the NOVAS 2.0 code of 1998. NOVAS 3.0 also fully implemented recent resolutions by the International Astronomical Union (IAU) on positional astronomy, including new reference system definitions and updated models for precession and nutation. The paper by Kaplan et al. (1989, Astron. J. 97, 1197) describes the overall computational strategy used by NOVAS, although many of the individual algorithms described there have been improved. USNO Circular 179 describes the IAU recommendations that underpin much of NOVAS 3.0 and is the basic reference for NOVAS algorithms relating to time, Earth orientation, and the transformations between various astronomical reference systems. The current version, NOVAS 3.1, provides some new capabilities and fixes some bugs.
The non-parametric entropy estimation toolbox includes estimators for entropy, mutual information, and conditional mutual information for both continuous and discrete variables. It also includes a KL divergence estimator for continuous distributions and a mutual information estimator between continuous and discrete variables.
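For discrete variables, the plug-in versions of these estimators can be written in a few lines. The sketch below is a toy illustration of the quantities involved, not the toolbox's API (the continuous-variable estimators it provides are considerably more sophisticated):

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Plug-in (maximum-likelihood) entropy estimate in bits."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# A fair coin carries one bit of entropy; a variable shares all of its
# entropy with itself, so I(X;X) = H(X).
coin = [0, 1] * 50
print(entropy(coin))
print(mutual_information(coin, coin))
```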
A package designed to address the problem of non-parametric statistical modeling of probability densities and regression surfaces. This type of modeling becomes very useful when there is little prior information available to justify an assumption that the data belongs to a certain parametric family of distributions or curves. The main intended application is fast and detailed modeling of the response and transfer functions of particle detectors, but the package is sufficiently general and can be used for solving a variety of statistical analysis problems from other areas. Both univariate and multivariate models are supported, and a number of original algorithms are implemented.
A general purpose library for multiphysics simulations based on finite elements.
A micromagnetic simulation package.
A data mining benchmark suite containing a mix of several representative data mining applications from different application domains. This benchmark is intended for use in computer architecture research, systems research, performance evaluation, and high-performance computing.
The Numenta Platform for Intelligent Computing comprises a set of learning algorithms that were first described in a white paper published by Numenta in 2009. The learning algorithms are intended to capture how layers of neurons in the neocortex learn.
Software allowing synchronized exchange of coupling information between numerical codes representing different components of the climate system. OASIS3-MCT, the new version of the OASIS coupler interfaced with the Model Coupling Toolkit (MCT) from Argonne National Laboratory, offers a fully parallel implementation of coupling-field regridding and exchange. Low intrusiveness, portability and flexibility are OASIS3-MCT's key design concepts. OASIS3-MCT supports coupling of general two-dimensional fields; unstructured grids and 3D grids are also supported, using a one-dimensional representation of the two- or three-dimensional structures. Thanks to MCT, all transformations, including regridding, are executed in parallel on the set of source or target component processes, and all coupling exchanges are executed in parallel directly between the components via the Message Passing Interface (MPI). OASIS3-MCT also supports file I/O using NetCDF, allowing an easy switch between coupled and forced modes. In the current version, however, this functionality is not parallel: reading and writing of the fields is performed by the master process only.
KML currently handles spatial and temporal tags, but not data content. What follows is an attempt to devise a content schema and mapping that would allow not only the display of data content, but also some meaningful data sharing within the observations community, using KML/KMZ as a data transport mechanism.
A package in the R statistical language that helps oceanographers do their work.
Ocean C-grid model setup and analysis tools, for the numerical mariner.
Provides interactive exploration, analysis and visualization of oceanographic and other geo-referenced profile or sequence data. It is available for all major computer platforms and currently has more than 20,000 registered users. ODV has a very rich set of interactive capabilities and supports a very wide range of plot types. This makes ODV ideal for visual and automated quality control.
The OpenFabrics Enterprise Distribution (OFED™) is open-source software for RDMA and kernel bypass applications. OFED includes kernel-level drivers, channel-oriented RDMA and send/receive operations, kernel bypasses of the operating system, both kernel and user-level application programming interface (API) and services for parallel message passing (MPI), sockets data exchange (e.g., RDS, SDP), NAS and SAN storage (e.g. iSER, NFS-RDMA, SRP) and file system/database systems. The network and fabric technologies that provide RDMA performance with OFED include: legacy 10 Gigabit Ethernet, iWARP for Ethernet, RDMA over Converged Ethernet (RoCE), and 10/20/40 Gigabit InfiniBand.
A self-contained C++ class library for the automatic layout of diagrams. OGDF offers sophisticated algorithms and data structures to use within your own applications or scientific projects.
A command line utility that converts a graph layout stored as GML file into a graphics file. Supported graphics formats are the bitmap formats PNG, JPEG, TIFF and the vector graphics formats SVG, PDF, EPS. It is based on OGDF and uses Qt 4 for high-quality graphics rendering.
An optimizing compiler for the Itanium and x86-64 microprocessor architectures. It derives from the SGI compilers for the MIPS R10000 processor, called MIPSPro. It was initially released in 2000 as GNU GPL software under the name Pro64. The following year, University of Delaware adopted the project and renamed the compiler to Open64. It now mostly serves as a research platform for compiler and computer architecture research groups. Open64 supports Fortran 77/95 and C/C++, as well as the shared memory programming model OpenMP. It can conduct high-quality interprocedural analysis, data-flow analysis, data dependence analysis, and array region analysis.
A tool for automatic differentiation of numerical computer programs.
An optimized BLAS library based on GotoBLAS2 1.13 BSD version.
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. The library has more than 2500 optimized algorithms, including a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high-resolution image of an entire scene, find similar images in an image database, remove red eyes from images taken with flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc.
An open interface standard for (and free implementation of) a set of tools to quickly implement data assimilation and calibration for arbitrary numerical models. OpenDA aims to stimulate the use of data assimilation and calibration by lowering implementation costs and enhancing the exchange of software among researchers and end users.
OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics. It includes tools for meshing, notably snappyHexMesh, a parallelised mesher for complex CAD geometries, and for pre- and post-processing. Almost everything (including meshing, and pre- and post-processing) runs in parallel as standard, enabling users to take full advantage of computer hardware at their disposal.
Source Installation: http://www.openfoam.org/download/source.php
An open source C++ toolkit designed to assist the creative process by providing a simple and intuitive framework for experimentation. The toolkit is designed to work as a general purpose glue, and wraps together several commonly used libraries.
A C++ template library for discrete factor graph models and distributive operations on these models. It includes state-of-the-art optimization and inference algorithms beyond message passing. OpenGM handles large models efficiently, since (i) functions that occur repeatedly need to be stored only once and (ii) when functions require different parametric or non-parametric encodings, multiple encodings can be used alongside each other, in the same model, using included and custom C++ code. No restrictions are imposed on the factor graph or the operations of the model. OpenGM is modular and extendible. Elementary data types can be chosen to maximize efficiency. The graphical model data structure, inference algorithms and different encodings of functions inter-operate through well-defined interfaces. The binary OpenGM file format is based on the HDF5 standard and incorporates user extensions automatically.
A high-performance implementation of the Myrinet Express message-passing stack over generic Ethernet networks. It provides application-level and wire-protocol compatibility with the native MXoE (Myrinet Express over Ethernet) stack.
Open Navigation Surface
A design for a databased alternative to traditional methods of representing bathymetric data. It aims to preserve the highest level of detail in every bathymetric dataset and provide methods for their combination and manipulation to generate multiple products for both hydrographic and non-hydrographic purposes. The advantages of the method over traditional schemes are such that a number of commercial vendors have adopted the technology. However, this means that there is a strong requirement for a method to communicate results in a vendor neutral technology. The Open Navigation Surface (ONS) project was designed to fill this gap by implementing a freely available source-code library to read and write all of the information required for a Navigation Surface.
Python package for universal numerical optimization.
Software allowing the concurrent execution and the intercommunication of programs based on in-house as well as commercial codes.
An open source high performance 3D graphics toolkit, used by application developers in fields such as visual simulation, games, virtual reality, scientific visualization and modelling.
A C++ terrain rendering SDK. Just create a simple XML file, point it at your imagery, elevation, and vector data, load it into your favorite OpenSceneGraph application, and go! osgEarth supports all kinds of data and comes with lots of examples to help you get up and running quickly and easily.
Open source software for building private and public clouds.
A Python client for the OpenStack Nova API.
A highly available, distributed, eventually consistent object/blob store. Organizations can use Swift to store lots of data efficiently, safely, and cheaply.
A cross-platform (Windows, Mac, and Linux) collection of software tools to support whole building energy modeling using EnergyPlus and advanced daylight analysis using Radiance. OpenStudio is an open source project to facilitate community development, extension, and private sector adoption. OpenStudio includes graphical interfaces along with a Software Development Kit (SDK).
A scientific library usable as a Python module dedicated to the treatment of uncertainties.
An open source, optimizing compiler suite for C, C++ and Fortran 95. It supports a variety of architectures including IA-32, X86_64, IA-64. To achieve portability, OpenUH is able to emit optimized C or Fortran 77 code that may be compiled by a native compiler on other platforms. The supporting runtime libraries are also portable - the OpenMP runtime library is based on the portable Pthreads interface while the Coarray Fortran runtime library is based, optionally, on the portable GASNet or ARMCI communications interfaces.
Multi-modal medical and brain data visualization.
An expandable remote sensing and imagery analysis software platform.
Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics.
A set of algorithmic components, adapted to large remote sensing images, which make it possible to capitalize on methodological know-how and thus take an incremental approach that benefits from the results of methodological research. ORFEO Toolbox (OTB) is distributed as an open source library of image processing algorithms. As its motto goes, "Orfeo Toolbox is not a black box": OTB encourages full access to the details of all the algorithms. OTB is based on the medical image processing library ITK and offers particular functionality for remote sensing image processing in general and for high spatial resolution images in particular. Targeted algorithms for high resolution optical images (SPOT, Quickbird, WorldView, Landsat, Ikonos), hyperspectral sensors (Hyperion) and SAR (TerraSAR-X, ERS, PALSAR) are available.
Portable C++ libraries for advanced machine and robot control.
OSL, the Orléans Skeleton Library, provides a set of data-parallel skeletons that follow the BSP model of parallel computation. OSL is a C++ library currently implemented on top of MPI, and it uses meta-programming techniques to achieve good efficiency. The goal is to provide an easy-to-use library for a widely used programming language that allows simple reasoning about parallel performance, based on a simple and portable cost model.
A constructive parallel skeleton library written in C++ with MPI intended for distributed environments such as PC clusters. SkeTo provides data parallel skeletons for lists (distributed one-dimensional arrays), matrices (distributed two-dimensional arrays), and trees (distributed binary trees). SkeTo enables users to write parallel programs as if they were sequential, since the distribution, gathering, and parallel computation of data are concealed within constructors of data types or definitions of parallel skeletons.
An object-oriented code framework for solving partial differential equations (PDEs).
Software that gives you universal access to your files through a web interface or WebDAV. It also provides a platform to easily view & sync your contacts, calendars and bookmarks across all your devices and enables basic editing right on the web. Installation has minimal server requirements, doesn’t need special permissions and is quick. ownCloud is extendable via a simple but powerful API for applications and plugins.
A Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models. It supports WMS, WFS, WCS, WMC, SOS, SensorML, CSW, WPS, Filter, OWS Common, etc.
Parallel Three-Dimensional Fast Fourier Transforms is a library for large-scale computer simulations on parallel platforms. 3D FFT is an important algorithm for simulations in a wide range of fields, including studies of turbulence, climatology, astrophysics and material science.
The p4est software library enables the dynamic management of a collection of adaptive octrees, conveniently called a forest of octrees. p4est is designed to work in parallel and scale to hundreds of thousands of processor cores.
A Python package which allows you to perform arithmetic on random variables just like you do with ordinary program variables. The variables can follow practically any distribution.
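The idea is that random variables compose under the ordinary arithmetic operators. The toy sketch below conveys that interface by Monte Carlo sampling; the package itself computes the resulting distributions numerically rather than by sampling, and the class and method names here are invented for illustration:

```python
import random

class RandomVariable:
    """Toy random variable supporting arithmetic, approximated by
    sampling. (The real package computes distributions numerically,
    not by Monte Carlo.)"""
    def __init__(self, sampler):
        self.sample = sampler
    def __add__(self, other):
        return RandomVariable(lambda: self.sample() + other.sample())
    def mean(self, n=100_000):
        return sum(self.sample() for _ in range(n)) / n

random.seed(0)
u = RandomVariable(lambda: random.uniform(0, 1))
s = u + u   # sum of two independent U(0,1) variables; mean is 1.0
print(round(s.mean(), 2))
```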
The Portable Application Code Toolkit is a comprehensive, integrated, and portable software development environment created for applications having unique requirements not met by available software. By defining a single, higher-level, standard programming interface, it shields application developers from the plethora of different hardware architectures and operating systems and their non-standard features. PACT is a set of libraries and utilities that integrates easily into your software project.
Python Bindings: https://wci.llnl.gov/codes/pact/pypact.html
Paegan attempts to fill the need for a high-level common data model (CDM) library for array-based met/ocean data stored in netCDF files or distributed over OPeNDAP.
A parallelized version of NCO. An API for data-parallel analysis of geodesic climate data as well as the set of data-parallel processing tools based on this API.
A library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
A universal document converter.
An open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities.
ParaView was developed to analyze extremely large datasets using distributed memory computing resources. It can be run on supercomputers to analyze datasets of terascale as well as on laptops for smaller data.
A parallel visualization tool for astrophysical simulation implemented as a ParaView plugin.
An open source toolkit plug-in designed to work with the ParaView project. This project provides space physics tools for analysis of model and spacecraft data. We are currently developing the basic functionality necessary to perform space weather science and forecasting tasks.
A ParaView plug-in interfaced around the H5FDdsm driver for steering and visualizing in-situ HDF5 output of simulation codes.
The MADAI Workbench is a custom version of ParaView that includes additional filters, file loaders, and visualization techniques developed by the MADAI group.
ParaView plugin containing a number of useful classes that can be used in the processing of meshless data.
Incorporates the provenance management capabilities of VisTrails into ParaView. All of the actions a user performs while building and modifying a pipeline in ParaView are captured by the plugin. This allows navigation of all of the pipeline versions that have previously been explored.
An industrial strength, interactive, mono- or stereoscopic viewer for 4-dimensional datasets. It is written in C++/OpenGL.
Parallel analysis tools and new visualization techniques for ultra-large climate data sets.
The parallel gridded analysis library.
A parallel version of NCL that runs NCL scripts in parallel and performs data analysis using ParGAL.
Produces over 600 plots and tables from CCSM (CAM) monthly netcdf files.
Pattern is a web mining module for the Python programming language. It bundles tools for data mining (Google + Twitter + Wikipedia API, web crawler, HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, k-means, k-NN, SVM) and network analysis (graph centrality & visualization).
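Of the machine learning tools listed, k-NN is the simplest to show in full. The sketch below is a minimal k-NN classifier in plain Python, not Pattern's API, just to make the technique concrete:

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest labelled
    points: the k-NN learner reduced to its core."""
    neighbors = sorted(train, key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Two clusters of labelled 2D points.
train = [((1.0, 1.0), "pos"), ((1.2, 0.8), "pos"),
         ((4.0, 4.2), "neg"), ((3.8, 4.0), "neg"), ((4.1, 3.9), "neg")]
print(knn_predict(train, (1.1, 1.0)))   # pos
print(knn_predict(train, (4.0, 4.0)))   # neg
```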
The Point Cloud Library is a standalone, large scale, open project for 2D/3D image and point cloud processing.
The Parallel Data Assimilation Framework - PDAF - is a software environment for ensemble data assimilation. PDAF simplifies the implementation of a data assimilation system with existing numerical models, so users can obtain a data assimilation system with less work and focus on applying data assimilation.
Program Database Toolkit (PDT) is a framework for analyzing source code written in several programming languages and for making rich program knowledge accessible to developers of static and dynamic analysis tools. PDT implements a standard program representation, the program database (PDB), that can be accessed in a uniform way through a class library supporting common PDB operations.
An open source C++ solver framework. It is based upon the fact that spacetrees, a generalisation of the classical octree concept, yield a cascade of adaptive Cartesian grids. Consequently, any spacetree traversal is equivalent to an element-wise traversal of the hierarchy of the adaptive Cartesian grids. The software Peano realises such a grid traversal and storage algorithm, and it provides hook-in points for applications performing per-element, per-vertex, and so forth operations on the grid. It also provides interfaces for dynamic load balancing, sophisticated geometry representations, and other features.
This software framework implements a NURBS-based Galerkin finite element method (FEM), popularly known as isogeometric analysis (IGA). It is heavily based on PETSc, the Portable, Extensible Toolkit for Scientific Computation. PETSc is a collection of algorithms and data structures for the solution of scientific problems, particularly those modeled by partial differential equations (PDEs). PETSc is written to be applicable to a range of problem sizes, including large-scale simulations where high-performance parallel computation is a must. PetIGA can be thought of as an extension of PETSc, adding the NURBS discretization capability and the integration of forms. The PetIGA framework is intended for researchers in the numerical solution of PDEs whose applications require extensive computational resources.
Branch of PETSc with OpenMP support.
The Persistent Homology Algorithm Toolbox contains methods for computing the persistence pairs of a filtered cell complex represented by an ordered boundary matrix with Z2 coefficients.
The Python Imaging Library (PIL) adds image processing capabilities to your Python interpreter. This library supports many file formats, and provides powerful image processing and graphics capabilities.
Plan 9 from Bell Labs is a research system developed at Bell Labs starting in the late 1980s. Its original designers and authors were Ken Thompson, Rob Pike, Dave Presotto, and Phil Winterbottom. They were joined by many others as development continued throughout the 1990s to the present.
Plan 9 demonstrates a new and often cleaner way to solve most systems problems. The system as a whole is likely to feel tantalizingly familiar to Unix users but at the same time quite foreign.
In Plan 9, each process has its own mutable name space. A process may rearrange, add to, and remove from its own name space without affecting the name spaces of unrelated processes. Included in the name space mutations is the ability to mount a connection to a file server speaking 9P, a simple file protocol. The connection may be a network connection, a pipe, or any other file descriptor open for reading and writing with a 9P server on the other end. Customized name spaces are used heavily throughout the system, to present new resources (e.g., the window system), to import resources from another machine (e.g., the network stack), or to browse backward in time (e.g., the dump file system).
A math-enabled Web 3.0 information portal — as such it combines the social, user generated Web (Web 2.0) with semantic features. The Planetary System can be instantiated by a user community into individual "planets" that aggregate material on a specific topic of joint interest. These sites (e.g. PlanetMath, the OAFF or PantaRhei, more) allow community members to access, interact with, discuss, create, and enhance knowledge items, creating a joint knowledge resource. The main difference between sites like these and Web 2.0 portals like Wikipedia is that the planets concentrate on semantic interactions and scientific topics (where semantic annotation of complex objects is worthwhile).
A coupled system of climate components for Earth, Mars and Titan.
The Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) project aims to address the critical and highly disruptive situation that is facing the Linear Algebra and High Performance Computing community due to the introduction of multi-core architectures.
A LaTeX document processing framework written entirely in Python. It currently comes bundled with an XHTML renderer (including multiple themes), as well as a way to simply dump the document to a generic form of XML. Other renderers can be added as well and are planned for future releases.
A Python library for distributing computations across the free computing units (CPUs and GPUs) available in a small network of multicore computers. Playdoh supports independent (embarrassingly parallel) problems as well as loosely coupled tasks such as global optimization, Monte Carlo simulation and numerical integration of partial differential equations.
A numerical library for C and C++ programmers. It is thread-safe and suitable for use in parallel environments.
A distributed file format conversion service built on top of third party software. Designed for the purpose of empirically estimating information loss across file format conversions, Polyglot uses software servers to access the file input/output functionality of software installed on a number of remote machines. The Polyglot server catalogs this functionality and uses the union of their capabilities to perform a relatively large number of conversions, depending on what is installed.
An air quality modeling system.
A Python tool for data processing and visualization in atmospheric sciences.
An ocean circulation model derived from earlier models of Bryan, Cox, Semtner and Chervin in which depth is used as the vertical coordinate. The model solves the three-dimensional primitive equations for fluid motions on the sphere under hydrostatic and Boussinesq approximations. Spatial derivatives are computed using finite-difference discretizations which are formulated to handle any generalized orthogonal grid on a sphere, including dipole and tripole grids which shift the North Pole singularity into land masses to avoid time step constraints due to grid convergence.
A programming language, development environment, and online community. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. Initially created to serve as a software sketchbook and to teach computer programming fundamentals within a visual context, Processing evolved into a development tool for professionals.
Processing continues to be an alternative to proprietary software tools with restrictive and expensive licenses, making it accessible to schools and individual students. Its open source status encourages the community participation and collaboration that is vital to Processing’s growth. Contributors share programs, contribute code, and build libraries, tools, and modes to extend the possibilities of the software. The Processing community has written more than a hundred libraries to facilitate computer vision, data visualization, music composition, networking, 3D file exporting, and programming electronics.
Processing.js is the sister project of the popular Processing visual programming language, designed for the web. Processing.js makes your data visualizations, digital art, interactive animations, educational graphs, video games, etc. work using web standards and without any plug-ins. You write code using the Processing language, include it in your web page, and Processing.js does the rest. It’s not magic, but almost.
A statistical relational learning and reasoning system that supports efficient learning and inference in relational domains. We provide an extensive set of open-source tools for both undirected and directed statistical relational models.
Though ProbCog is a general-purpose software suite, it was designed with the particular needs of technical systems in mind. Our methods are geared towards practical applicability and can easily be integrated into other applications. The tools for relational data collection and transformation facilitate data-driven knowledge engineering, and the availability of graphical tools makes both learning or inference sessions a user-friendly experience. Scripting support enables automation, and for easy integration into other applications, we provide a client-server library.
A Python package for rapidly developing computer models and numerical methods. It is focused on models of continuum mechanical processes described by partial differential equations and on discretizations and solvers for computing approximate solutions to these equations.
A high-performance, robust, memory efficient, and scalable software for solving large sparse symmetric and unsymmetric linear systems of equations on shared-memory and distributed-memory architectures using thousands of compute cores. PSPIKE combines the robustness of a direct linear solver and the performance scalability of an iterative linear solver.
Features of the library version: unsymmetric or symmetric systems; real arithmetic; parallel on distributed-memory clusters; combinatorial graph algorithms.
A C++ library that handles piecewise linear bijections between triangulated surfaces. These surfaces can be of arbitrary shape and need not even be manifolds.
A PostgreSQL adapter for the Python programming language. At its core it fully implements the Python DB API 2.0 specifications. Several extensions allow access to many of the features offered by PostgreSQL.
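The DB API 2.0 shape — connect, cursor, execute with placeholders, commit, fetch — is the same across adapters. The sketch below uses the stdlib sqlite3 module to stand in for a live PostgreSQL server (which is not assumed here); the one visible difference is that psycopg2 uses %s placeholders where sqlite3 uses ?:

```python
import sqlite3

# The DB API 2.0 pattern that psycopg2 implements, shown with sqlite3
# since no PostgreSQL server is assumed in this sketch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE obs (station TEXT, temp REAL)")
cur.executemany("INSERT INTO obs VALUES (?, ?)",
                [("A", 21.5), ("B", 19.0)])
conn.commit()
cur.execute("SELECT station, temp FROM obs WHERE temp > ?", (20,))
rows = cur.fetchall()
conn.close()
print(rows)   # [('A', 21.5)]
```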
Brings state-of-the-art parallel I/O concepts to production parallel systems.
Partial wavelet coherence is a technique similar to partial correlation that helps identify the resulting wavelet coherence between two time series after eliminating the influence of their common dependence. Multiple wavelet coherence, akin to multiple correlation, is useful in seeking the resulting wavelet coherence of multiple independent variables on a dependent one.
A Matlab package for performing crosswavelet and wavelet coherence analysis.
Algebraic multigrid solvers in Python.
A hyperbolic PDE solver in 1D, 2D, and 3D, including mapped grids and surfaces, built on Clawpack.
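The simplest member of the class of problems such solvers handle is 1D advection, q_t + u q_x = 0. The sketch below is a first-order finite-volume upwind scheme in plain Python, just to illustrate the kind of update involved; it is not PyClaw's API, which provides high-order wave-propagation methods:

```python
def advect_upwind(q, u, dx, dt, steps):
    """First-order upwind finite-volume update for q_t + u*q_x = 0
    (u > 0) with periodic boundaries. Stable for u*dt/dx <= 1 (CFL)."""
    nu = u * dt / dx
    assert 0 < nu <= 1, "CFL condition violated"
    q = list(q)
    for _ in range(steps):
        q = [q[i] - nu * (q[i] - q[i - 1]) for i in range(len(q))]
    return q

# Advect a square pulse once around a periodic domain. The scheme is
# conservative, so the cell total is preserved exactly; the pulse is
# smeared by numerical diffusion but stays bounded.
n = 50
q0 = [1.0 if 10 <= i < 20 else 0.0 for i in range(n)]
q = advect_upwind(q0, u=1.0, dx=1.0 / n, dt=0.5 / n, steps=2 * n)
print(round(sum(q0), 6), round(sum(q), 6))   # totals match
```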
A specialized version of some Clawpack and AMRClaw routines that have been modified to work well for certain geophysical flow problems.
Currently the focus is on 2D depth-averaged shallow water equations for flow over varying topography. The term bathymetry is often used for underwater topography (sea floor or lake bottom), but in this documentation and in the code the term topography is often used to refer to either.
A Python package designed to accomplish common tasks that arise in the analysis of climate variability. It provides functions for simple I/O operations, handling of COARDS-compliant netCDF files, EOF analysis, SVD and CCA analysis of coupled data sets, some linear digital filters, kernel-based probability density function estimation, and access to the DCDFLIB.C library from Python.
A suite of tools to process, analyze, visualize and benchmark scientific model output against other model output or against observational data. It is particularly useful for efficiently analyzing output from climate model simulations.
A unifying multibody dynamics algorithm development workbench.
A project to bring CSP (Communicating Sequential Processes) to Python.
Fully implements the OpenGIS Catalogue Service Implementation Specification. Allows for the publishing and discovery of geospatial metadata. Existing repositories of geospatial metadata can also be exposed via OGC:CSW 2.0.2, providing a standards-based metadata and catalogue component of spatial data infrastructures.
A Python package for accessing Nvidia's CUDA parallel computation API.
Provides PyCUDA bindings for the CULA port of LAPACK to NVIDIA’s CUDA GPGPU programming environment. Mixing PyCUDA-style kernel code and CULA device function calls is supported.
Exterior calculus is the generalization of vector calculus to manifolds. PyDEC is a Python library for computations related to the discretization of exterior calculus, including the numerical solution of partial differential equations. It is also useful for purely topological computations; PyDEC thus facilitates inquiry into both physical problems on manifolds and purely topological problems on abstract complexes. It uses efficient algorithms for constructing the relevant operators and objects. These algorithms are formulated in terms of high-level matrix operations which extend to arbitrary dimension; as a result, the implementations map well to the facilities of numerical libraries such as NumPy and SciPy. The availability of such libraries makes Python suitable for prototyping numerical methods. The code and the companion paper include examples demonstrating how PyDEC is used to solve physical and topological problems.
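The discrete operators involved obey the same identity as their smooth counterparts: the boundary of a boundary vanishes (equivalently, applying the exterior derivative twice gives zero). A plain-Python sketch of that fact on oriented simplices (illustrating the mathematics, not PyDEC's sparse-matrix API):

```python
def boundary_faces(simplex):
    """Oriented boundary of a simplex: its faces with alternating
    signs, obtained by dropping one vertex at a time."""
    return [((-1) ** i, simplex[:i] + simplex[i + 1:])
            for i in range(len(simplex))]

def boundary_of_boundary(simplex):
    """Apply the boundary operator twice, collecting signed subfaces."""
    total = {}
    for s1, face in boundary_faces(simplex):
        for s2, subface in boundary_faces(face):
            total[subface] = total.get(subface, 0) + s1 * s2
    return total

# Each subface appears exactly twice with opposite signs, so every
# coefficient cancels: the boundary of a boundary is zero.
result = boundary_of_boundary((0, 1, 2, 3))
print(all(v == 0 for v in result.values()))   # True
```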
A sophisticated & integrated simulation and analysis environment for dynamical systems models of physical systems (ODEs, DAEs, maps, and hybrid systems). PyDSTool is platform independent, written primarily in Python with some underlying C and Fortran legacy code for fast solving. It makes extensive use of the numpy and scipy libraries. PyDSTool supports symbolic math, optimization, phase plane analysis, continuation and bifurcation analysis, data analysis, and other tools for modeling — particularly for biological applications.
A workflow that utilizes an array of scientific tools written in the Python programming language to study multibody dynamics. The core of this toolset is the SymPy mechanics package, which generates symbolic equations of motion for complex multibody systems.
Provides scientific-grade astronomical computations for the Python programming language. Given a date and location on the Earth’s surface, it can compute the positions of the Sun and Moon, of the planets and their moons, and of any asteroids, comets, or earth satellites whose orbital elements the user can provide. Additional functions are provided to compute the angular separation between two objects in the sky, to determine the constellation in which an object lies, and to find the times at which an object rises, transits, and sets on a particular day.
A Python toolset providing access to GDP functionality.
A software library intended to simplify the management, analysis, and visualization of gridded geophysical datasets such as those generated by climate models. The library provides three main advantages. Firstly, it can define a geophysical coordinate system for any given dataset, and allows operations to be carried out conceptually in this physical coordinate system, in a way that is independent of the native coordinate system of a particular dataset. This greatly simplifies working with datasets from different sources. Secondly, the library allows mathematical operations to be performed on datasets which fit on disk but not in memory; this is useful for dealing with the extremely large datasets generated by climate models, and permits operations to be performed over networks. Finally, the library provides tools for visualizing these datasets in a scientifically useful way. The library is written in Python, and makes use of a number of existing packages to perform the underlying computations and to create plots.
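The second point — operating on datasets that fit on disk but not in memory — comes down to streaming: compute the statistic one chunk at a time. A from-scratch sketch, with a small numpy.memmap standing in for a huge model-output file (none of this is the library's own API):

```python
import numpy as np
import tempfile, os

def chunked_mean(memmapped, chunk_rows=1000):
    """Mean over axis 0, reading only `chunk_rows` rows at a time."""
    total = np.zeros(memmapped.shape[1:])
    for start in range(0, memmapped.shape[0], chunk_rows):
        block = np.asarray(memmapped[start:start + chunk_rows])  # one chunk in RAM
        total += block.sum(axis=0)
    return total / memmapped.shape[0]

# Build a small on-disk array standing in for a large model-output file.
path = os.path.join(tempfile.mkdtemp(), "field.dat")
data = np.memmap(path, dtype="float64", mode="w+", shape=(5000, 8))
data[:] = np.arange(5000)[:, None]   # each row holds its row index
data.flush()

m = chunked_mean(np.memmap(path, dtype="float64", mode="r", shape=(5000, 8)))
# mean of 0..4999 is 2499.5 in every column
assert np.allclose(m, 2499.5)
```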
The goal of this project is to allow the use of the entire Globus toolkit from Python, a high-level scripting language. SWIG is used to generate the necessary interface code.
The Python Parallel Global Multiobjective Optimizer is a scientific library providing a large number of optimisation problems and algorithms under the same powerful parallelization abstraction built around the generalized island-model paradigm. What this means to the user is that the available algorithms are all automatically parallelized (asynchronously, coarse-grained approach) thus making efficient use of the underlying multicore architecture.
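The generalized island model is easy to sketch in pure Python: each island evolves independently (that independent evolution is what gets parallelized), then periodically exchanges its best individual with a neighbor on a ring. A toy version with a trivial hill-climbing "algorithm" and a sphere objective; all names here are invented for illustration:

```python
import random

def sphere(x):
    """Objective to minimize: sum of squares, optimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def mutate(x, step=0.3):
    return [xi + random.gauss(0.0, step) for xi in x]

def evolve(pop, generations=50):
    """(1+1)-style hill climbing on each individual: keep a mutant only
    if it improves, so an island's best fitness never worsens."""
    for _ in range(generations):
        pop = [min(x, mutate(x), key=sphere) for x in pop]
    return pop

random.seed(0)
dim = 5
islands = [[[random.uniform(-5, 5) for _ in range(dim)] for _ in range(10)]
           for _ in range(4)]
initial_best = min(sphere(x) for pop in islands for x in pop)

# Island model: evolve islands independently (the parallelizable part),
# then migrate each island's best individual to its ring neighbor.
for epoch in range(10):
    islands = [evolve(pop) for pop in islands]
    bests = [min(pop, key=sphere) for pop in islands]
    for i, pop in enumerate(islands):
        worst = max(range(len(pop)), key=lambda j: sphere(pop[j]))
        pop[worst] = bests[(i - 1) % len(islands)]   # immigrant replaces worst

final_best = min(sphere(x) for pop in islands for x in pop)
```

Asynchronous, coarse-grained parallelization as in the library would run each `evolve` call on its own core, with migration happening whenever islands finish an epoch.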
A plotting library for Tkinter Python programmers. Pygmyplot is based on the popular and powerful matplotlib, but does not require the python programmer to know nitty-gritty details of matplotlib programming. However, pygmyplot provides access to all of matplotlib’s functionality just below the surface. Pygmyplot is designed to work more seamlessly with the Tkinter event loop than matplotlib’s own simplified wrapper, pyplot.
A Python interface to GrADS that provides an alternative method of scripting GrADS that can take advantage of the unique capabilities of Python, and gives you access to a wealth of numerical and scientific software available for this platform.
A python package used to construct, manipulate, and perform computations on 3D triangulated surfaces. It is a hand-crafted and pythonic binding for the GNU Triangulated Surface (GTS) Library.
A knowledge-based inference engine.
A Python package for creating, parsing, manipulating, and validating KML, a language for encoding and annotating geographic data.
A Python reincarnation of AMPL and GNU MathProg modeling language, implemented in pure Python, connecting to GLPK via PyGLPK. Create, optimize, report, change and re-optimize your model with Python, which offers numerous handy goodies. Being embedded in Python, you can take advantage of the other good things available in python: such as easy database access, graphical presentation of your solution, statistical analysis, or use pymprog for artificial intelligence in games, etc.
A suite of software packages necessary to build and run a Python Coupler like PyCCSM. MCT is a high performance regridding and parallel communication package designed to address issues of coupling multiple scientific models on different scales and grids to one another.
An implementation of Thomson's (1982) multi-taper Fourier spectral estimator, plus a Python interface. The core code is due to Lees and Park (1995) and uses the conventions of Percival and Walden (1993).
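The multitaper idea: average the periodograms of several orthogonally tapered copies of the series, trading a little resolution for much lower variance. A hedged NumPy sketch using sine tapers (Riedel and Sidorenko) rather than the Slepian tapers of Thomson's estimator, since sine tapers have a simple closed form:

```python
import numpy as np

def sine_tapers(n, k):
    """First k sine tapers (Riedel & Sidorenko): orthonormal, closed-form
    alternatives to the Slepian tapers used in Thomson's method."""
    j = np.arange(1, k + 1)[:, None]
    t = np.arange(1, n + 1)[None, :]
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * j * t / (n + 1))

def multitaper_psd(x, k=5):
    """Average the periodograms of k tapered copies of x; averaging over
    (nearly) uncorrelated tapers reduces the estimator's variance."""
    tapers = sine_tapers(len(x), k)
    spectra = np.abs(np.fft.rfft(tapers * x[None, :], axis=1)) ** 2
    return spectra.mean(axis=0)

# Sinusoid sitting exactly on bin 32 of a 256-point record, plus noise.
rng = np.random.default_rng(1)
n = 256
x = np.sin(2 * np.pi * 32 * np.arange(n) / n) + 0.1 * rng.standard_normal(n)
psd = multitaper_psd(x)
```

The averaged spectrum concentrates the sinusoid's power in a narrow band around bin 32, with the peak slightly broadened by the tapering.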
A Python package for coordinate-free symbolic math, based on Geometric Algebra (Clifford Algebra) and Geometric Calculus (Clifford Analysis).
A Python program to create nomographs/nomograms.
Lets you access the OpenCL parallel computation API from Python.
The pod package is an implementation of a Proper Orthogonal Decomposition method.
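Snapshot POD reduces to a thin SVD of the mean-centered data matrix. A minimal NumPy sketch (not the pod package's own API):

```python
import numpy as np

def pod(snapshots, n_modes):
    """Proper Orthogonal Decomposition of a (space x time) snapshot matrix
    via the thin SVD; returns the mean, spatial modes, singular values,
    and temporal coefficients."""
    mean = snapshots.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    return mean, u[:, :n_modes], s[:n_modes], vt[:n_modes]

# Synthetic field: two coherent structures plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)[:, None]
t = np.linspace(0, 10, 60)[None, :]
field = np.sin(np.pi * x) * np.cos(t) + 0.3 * np.sin(3 * np.pi * x) * np.sin(2 * t)
field += 0.01 * rng.standard_normal(field.shape)

mean, modes, sing, coeffs = pod(field, n_modes=2)
recon = mean + modes @ np.diag(sing) @ coeffs
# Two modes capture essentially all the variance of this rank-2 field.
```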
A pure-python graphics and GUI library built on PyQt4 / PySide and numpy. It is intended for use in mathematics / scientific / engineering applications. Despite being written entirely in python, the library is very fast due to its heavy leverage of numpy for number crunching and Qt’s GraphicsView framework for fast display.
Lets you write code that mixes Python and C data types any way you want, and compiles it into a C extension for Python.
A library that reads and writes ESRI shapefiles in Python. You can read and write shp, shx, and dbf files with all types of geometry. Everything in the public ESRI shapefile specification is implemented.
A package for managing hierarchical datasets, designed to efficiently and easily cope with extremely large amounts of data. PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy to use tool for interactively browsing, processing, and searching very large amounts of data. One important feature of PyTables is that it optimizes memory and disk resources so that data takes much less space (especially if on-the-fly compression is used) than other solutions such as relational or object-oriented databases.
A GUI for browsing and editing files in both PyTables and HDF5 formats.
A Python implementation of the tensor toolkit.
Provides a seamless glue layer between Numpy and Boost.Ublas for use with Boost.Python.
A companion to PyUblas that exposes a variety of useful additions including a cross-language "operator" class for building matrix-free algorithms, CG and BiCGSTAB linear solvers that use the operator class, an ARPACK interface that uses it, a UMFPACK interface for sparse matrices, and an interface to the DASKR ODE solver.
pyunicorn (the UNIfied COmplex network and RecurreNce analysis toolbox) is a fully object-oriented Python package for the advanced analysis and modeling of complex networks. Beyond the standard measures of complex network theory, such as degree, betweenness and clustering coefficient, it provides some uncommon but interesting statistics like Newman's random walk betweenness. pyunicorn also features novel node-weighted (node splitting invariant) network statistics as well as measures designed for analyzing networks of interacting/interdependent networks.
Moreover, pyunicorn makes it easy to construct networks from uni- and multivariate time series data (functional (climate) networks and recurrence networks). This involves linear and nonlinear measures of time series analysis for constructing functional networks from multivariate data, as well as modern techniques of nonlinear analysis of single time series, such as recurrence quantification analysis (RQA) and recurrence network analysis.
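A recurrence network starts from nothing more than a thresholded pairwise-distance matrix. A hedged NumPy sketch of that first step (pyunicorn's own API differs):

```python
import numpy as np

def recurrence_matrix(series, eps):
    """Binary recurrence matrix: R[i, j] = 1 when states i and j lie
    closer than eps. Dropping the diagonal turns R into the adjacency
    matrix of a recurrence network."""
    d = np.abs(series[:, None] - series[None, :])   # pairwise distances
    return (d < eps).astype(int)

# A periodic signal revisits the same states once per cycle.
t = np.linspace(0, 8 * np.pi, 400)
x = np.sin(t)

R = recurrence_matrix(x, eps=0.1)
A = R - np.eye(len(x), dtype=int)   # adjacency: remove self-loops
recurrence_rate = R.mean()          # density of recurrences, in (0, 1)
```

For multidimensional states one would simply replace the scalar distance with a norm over delay-embedded vectors; RQA statistics are then read off the diagonal and vertical line structures of R.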
Pyvisfile allows you to write a variety of visualization file formats, including Kitware's XML-style Vtk data files, and Silo visualization files, as introduced by LLNL's MeshTV and more recently used by the VisIt large-scale visualization program. pyvisfile supports many mesh geometries, such as unstructured and rectangular structured meshes and particle meshes, as well as scalar and vector variables on them. In addition, pyvisfile allows the semi-automatic writing of parallelization-segmented visualization files in both Silo and Vtk formats. For Silo files, pyvisfile also supports the writing of expressions as visualization variables.
Pyrex is a compiler, so it is natural that people tend to go through an edit/compile/test cycle with Pyrex modules. But my personal opinion is that one of the deep insights in Python's implementation is that a language can be compiled (Python modules are compiled to .pyc files) while hiding that compilation process from the end-user, so that they do not have to worry about it. Pyximport does this for Pyrex modules.
Python package to handle rings/eddies in the ocean.
A Python wrapper for the Rice Wavelet Toolbox.
A Python binding to the Qt library. The various parts that comprise PySide are:
apiextractor - Used by the binding generator to parse headers of a given library and merge this data with information provided by typesystem (XML) files, resulting in a representation of how the API should be exported to the chosen target language. The generation of source code for the bindings is performed by specific generators using the API Extractor library.
generatorrunner - A utility that parses a collection of header and typesystem files, generating other files (code, documentation, etc.) as result.
shiboken - A Python bindings generator that outputs CPython code.
pyside - Generates the Qt bindings.
pyside-tools - Four tools for PySide.
Why? This is used by Matplotlib.
General instructions on how to build this on Linux can be found at:
but there are some other problems I encountered.
First, Python must be built either as a shared library:
./configure --enable-shared --prefix=/opt/python2.7
or with the -fPIC flag set:
export CFLAGS=-fPIC
export CPPFLAGS=-fPIC
./configure --prefix=/opt/python2.7
Second, if you want to use a version of Qt in a non-standard location such as /opt/qt-4.8.4, then you need to specify the following for all five packages:
export LD_LIBRARY_PATH=/opt/qt-4.8.4/lib:$LD_LIBRARY_PATH
export PYTHONPATH=/opt/python2.7/lib/python2.7/site-packages:$PYTHONPATH
export PATH=/opt/qt-4.8.4/bin:$PATH
export PKG_CONFIG_PATH=/opt/qt-4.8.4/lib/pkgconfig:$PKG_CONFIG_PATH
to get all the cmake Qt dependencies right. There is not an obvious way within cmake to specify a general alternate root directory for Qt, so we must do it in this roundabout way.
Also, for shiboken, if you have several versions of Python, you need to set the following variables, either directly in CMakeCache.txt or indirectly via the ccmake command:
PYTHON_EXECUTABLE /usr/bin/python2.7
PYTHON_INCLUDE_DIR /opt/python2.7/include/python2.7
PYTHON_LIBRARY /opt/python2.7/lib/libpython2.7.a
Even after all that, the Python module is installed in the wrong place, i.e.
and I can’t figure out how to tweak the cmake stuff to get it in the right place. Thus, we must do so manually, i.e.
cp /usr/local/lib/python2.7/site-packages/shiboken.so /opt/python2.7/lib/python2.7/site-packages
A python-based implementation of the OGC SOS standard. PySOS is a lightweight set of scripts that work in conjunction with a web server to serve data from a relational database.
Python Computer Graphics Kit
A collection of Python modules, plugins and utilities that are meant to be useful for any domain where you have to deal with 3D data of any kind, be it for visualization, creating photorealistic images, Virtual Reality or even games.
Miscellaneous Python tools for oceanographers.
A LaTeX package that allows Python code entered within a TeX document to be executed, and the output to be included in the original document.
A Python to C++ compiler for a subset of the Python language.
Quantum GIS (QGIS) is a user friendly Open Source Geographic Information System (GIS) licensed under the GNU General Public License. QGIS supports vector, raster, and database formats.
A Python interface for the functionality of QGIS.
Open-source MAVLink Micro Air Vehicle Communication Protocol with lightweight serialization functions for microcontrollers. QGroundControl’s main interface protocol is MAVLink, a binary, serial stream protocol which QGroundControl can receive over UDP or serial links (radio modems).
Open-source software for simulating the dynamics of open quantum systems. The QuTiP library depends on the excellent Numpy and Scipy numerical packages. In addition, graphical output is provided by Matplotlib. QuTiP aims to provide user-friendly and efficient numerical simulations of a wide variety of Hamiltonians, including those with arbitrary time-dependence, commonly found in a wide range of physics applications such as quantum optics, trapped ions, superconducting circuits, and quantum nanomechanical resonators.
Python bindings for MAVLink.
A collection of Matlab files for 1D and 2D wavelet and filter bank design, analysis, and processing.
A Python implementation of the Web Processing Service standard from the Open Geospatial Consortium.
QUARK (QUeuing And Runtime for Kernels) provides a library that enables the dynamic execution of tasks with data dependencies in a multi-core, multi-socket, shared-memory environment. QUARK infers data dependencies and precedence constraints between tasks from the way that the data is used, and then executes the tasks in an asynchronous, dynamic fashion in order to achieve a high utilization of the available resources.
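The idea of inferring dependencies from data usage can be sketched without any of QUARK's machinery: each task declares what it reads and writes, and the scheduler derives read-after-write, write-after-read, and write-after-write edges. A toy, sequential Python sketch (all names invented; a real runtime would launch each ready set concurrently):

```python
def schedule(tasks):
    """Derive task dependencies from declared read/write sets and return
    a valid execution order plus the dependency graph."""
    deps, seen = {}, []
    for name, reads, writes in tasks:
        deps[name] = set()
        for prev_name, prev_reads, prev_writes in seen:
            # RAW: we read what they wrote; WAW: we overwrite their write;
            # WAR: we overwrite what they read.
            if prev_writes & (reads | writes) or (prev_reads & writes):
                deps[name].add(prev_name)
        seen.append((name, reads, writes))
    order, finished = [], set()
    while len(finished) < len(tasks):
        ready = [n for n, _, _ in tasks
                 if n not in finished and deps[n] <= finished]
        order.extend(ready)       # a real runtime runs these in parallel
        finished.update(ready)
    return order, deps

# Tile-style tasks: t1 and t2 both read A and are independent;
# t3 consumes their outputs.
tasks = [
    ("t1", {"A"}, {"B"}),
    ("t2", {"A"}, {"C"}),
    ("t3", {"B", "C"}, {"D"}),
]
order, deps = schedule(tasks)
```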
A bloody useful program.
A NetCDF package for R.
A package that provides R users with high-level facilities to generate KML, the Keyhole Markup Language for display in, e.g., Google Earth. By high-level, we mean that the R user does not have to (but can) create the XML directly herself. Instead, there are high-level functions which take care of these lower-level details.
A Scheme-based programming language designed for producing web applications, system programming and much more.
Enables Web-based geo data offerings and Big Data Analytics on multi-dimensional raster ("array") data of unlimited size.
Build an FM transmitter with a Raspberry Pi.
GPIO access library written in C for the BCM2835 used in the Raspberry Pi. It's released under the GNU LGPLv3 license and is usable from C and C++, and from many other languages with suitable wrappers (see below). It's designed to be familiar to people who have used the Arduino “wiring” system.
A set of free software C libraries that provide support for the Resource Description Framework (RDF).
Redland is a library that provides a high-level interface for the Resource Description Framework (RDF) allowing the RDF graph to be parsed from XML, stored, queried and manipulated. Redland implements each of the RDF concepts in its own class via an object based API, reflected into the language APIs, currently Perl, PHP, Python and Ruby. Several classes providing functionality such as for parsers, storage are built as modules that can be loaded at compile or run-time as required.
The RandomizED Singular Value Decomposition library solves several matrix decompositions, including singular value decomposition (SVD), principal component analysis (PCA), and eigenvalue decomposition. redsvd can handle very large matrices efficiently, and is optimized for a truncated SVD of sparse matrices. For example, redsvd can compute a truncated SVD with the top 20 singular values for a 100K x 100K matrix with 1M nonzero entries in less than one second. The algorithm is based on the randomized algorithm for computing large-scale SVDs. Although it uses randomized matrices, the results are very accurate with very high probability. See the experiments section for details.
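The randomized algorithm itself fits in a few lines: sketch the range of the matrix with a random Gaussian projection, orthonormalize, then solve the small projected problem exactly. A hedged NumPy version (not redsvd's C++ API), following Halko, Martinsson and Tropp:

```python
import numpy as np

def randomized_svd(a, rank, oversample=10, seed=None):
    """Truncated SVD by random range finding: sample range(A) with a
    Gaussian test matrix, orthonormalize, then take the exact SVD of the
    small projected matrix."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((a.shape[1], rank + oversample))
    q, _ = np.linalg.qr(a @ omega)            # orthonormal basis for range(A @ omega)
    u_small, s, vt = np.linalg.svd(q.T @ a, full_matrices=False)
    return (q @ u_small)[:, :rank], s[:rank], vt[:rank]

# A 300 x 200 matrix of exact rank 15: the sketch recovers it to
# machine precision with high probability.
rng = np.random.default_rng(42)
a = rng.standard_normal((300, 15)) @ rng.standard_normal((15, 200))
u, s, vt = randomized_svd(a, rank=15, seed=0)
err = np.linalg.norm(a - u @ np.diag(s) @ vt) / np.linalg.norm(a)
```

The expensive step is one pass of `a @ omega`, which is why the approach scales to matrices far too large for a dense SVD.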
Matlab modules for the estimation of mean values and covariance matrices from incomplete datasets, and the imputation of missing values in incomplete datasets.
A virtual environment for ocean numerical simulations.
A three-dimensional numerical oceanic model intended for simulating currents, ecosystems, biogeochemical cycles, and sediment movement in various coastal regions. It is called the Regional Oceanic Modeling System (ROMS). This IRD version of the code, ROMS_AGRIF, makes use of the AGRIF grid refinement procedure developed at the LJK-IMAG and is accompanied by a powerful toolbox for ROMS pre- and post-processing: ROMSTOOLS.
The ROOT system provides a set of OO frameworks with all the functionality needed to handle and analyze large amounts of data in a very efficient way. Having the data defined as a set of objects, specialized storage methods are used to get direct access to the separate attributes of the selected objects, without having to touch the bulk of the data. Included are histogramming methods in an arbitrary number of dimensions, curve fitting, function evaluation, minimization, graphics and visualization classes to allow the easy setup of an analysis system that can query and process the data interactively or in batch mode, as well as a general parallel processing framework, PROOF, that can considerably speed up an analysis.
The scripting, or macro, language and the programming language are both C++. The interpreter allows for fast prototyping of macros since it removes the time-consuming compile/link cycle. If more performance is needed, the interactively developed macros can be compiled using a C++ compiler via a machine-independent transparent compiler interface called ACLiC.
The python programming language is a popular, open-source, dynamic language with an interactive interpreter. Its interoperability with other programming languages, both for extending python as well as embedding it, is excellent and many existing third-party applications and libraries have therefore so-called "python bindings." PyROOT provides python bindings for ROOT: it enables cross-calls from ROOT/CINT into python and vice versa, the intermingling of the two interpreters, and the transport of user-level objects from one interpreter to the other. PyROOT enables access from ROOT to any application or library that itself has python bindings, and it makes all ROOT functionality directly available from the python interpreter.
Rose is a group of utilities and specifications which aim to provide a common way to manage the development and running of scientific application suites in both research and production environments.
An open source compiler infrastructure to build source-to-source program transformation and analysis tools for large-scale C (C89 and C99), C++ (C++98 and C++11), UPC, Fortran (77/95/2003), OpenMP, Java, Python and PHP applications.
A programming language designed for scientific computations.
A very simple, yet robust, Python interface to the R Programming Language. It can manage all kinds of R objects and can execute arbitrary R functions (including the graphic functions). All errors from the R language are converted to Python exceptions. Any module installed for the R system can be used from within Python.
A transparent python library for symmetrical remote procedure calls, clustering and distributed-computing. RPyC makes use of object-proxying, a technique that employs python’s dynamic nature, to overcome the physical boundaries between processes and computers, so that remote objects can be manipulated as if they were local.
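Object proxying is a plain Python technique: intercept attribute access and forward it to a target. A toy in-process sketch of the mechanism (RPyC itself forwards over a network connection and handles far more cases; everything here is invented for illustration):

```python
class Proxy:
    """Forwards attribute access to a target object. RPyC does the same
    across a connection; here the 'remote' side is just another object in
    the same process, to expose the mechanism."""
    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # in RPyC this would send a request over the wire and return
        # either a value or another proxy
        return getattr(object.__getattribute__(self, "_target"), name)

    def __setattr__(self, name, value):
        setattr(object.__getattribute__(self, "_target"), name, value)

class Service:
    def __init__(self):
        self.calls = 0
    def ping(self):
        self.calls += 1
        return "pong"

remote = Service()
proxy = Proxy(remote)
result = proxy.ping()   # looks like a local call, resolved on the target
```

Because `__getattr__` is only consulted when normal lookup fails, the proxy stays transparent: method calls, attribute reads and writes all land on the target.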
A Python wrapper of libspatialindex that provides the spatial indexing features of the latter. The functionality includes nearest neighbor search, intersection search, multi-dimensional indexes, clustered indexes, bulk loading, deletion, disk serialization, and custom storage.
A parallel scalable spherical harmonic transform package.
An array programming language predominantly suited for application areas such as numerically intensive applications and signal processing. Its distinctive feature is that it combines high-level program specifications with runtime efficiency similar to that of hand-optimized low-level specifications. Key to the optimization process that facilitates these runtimes is the underlying functional model which also constitutes the basis for implicit parallelization. This makes SaC ideally suited for harnessing the full potential of modern Chip Multiprocessor Architectures.
A standardized API for developing distributed applications that can run on grid and cloud infrastructure. The SAGA API has an emphasis on job handling and monitoring, file transfer and management as well as distributed orchestration mechanisms.
SAGA-Python provides a Python module that is compliant with the OGF GFD.90 SAGA specification.
An open-source software that provides a generic platform for Pre- and Post-Processing for numerical simulation. It is based on an open and flexible architecture made of reusable components.
The Stochastic Assimilation for the Next Generation Ocean Model Applications package is a set of tools for data assimilation. There are tools for diagnostics, perturbations, transformations, and various utilities.
An object-functional programming and scripting language for general software applications, statically typed, designed to concisely express solutions in an elegant, type-safe and lightweight manner.
The Scientific Computation and Visualization Environment is an environment for scientific computation, data analysis and data visualization designed for scientists, engineers and students. The program incorporates many open-source software packages into a coherent interface using the concept of dynamic scripting.
SCaVis can be used everywhere where an analysis of large numerical data volumes, data mining, statistical analysis and mathematics are essential (natural sciences, engineering, modeling and analysis of financial markets).
SCaVis is fully multiplatform and runs on any platform where Java is installed. As a Java application, SCaVis takes the full advantage of multicore processors.
An open source database technology product designed specifically to satisfy the demands of data-intensive scientific problems.
A collection of algorithms for image processing in Python.
Machine learning in Python.
A free and open source software for numerical computation providing a powerful computing environment for engineering and scientific applications. Scilab includes hundreds of mathematical functions. It has a high level programming language allowing access to advanced data structures, 2-D and 3-D graphical functions.
Sector/Sphere supports distributed data storage, distribution, and processing over large clusters of commodity computers, either within a data center or across multiple data centers. Sector is a high performance, scalable, and secure distributed file system. Sphere is a high performance parallel data processing engine that can process Sector data files on the storage nodes with very simple programming interfaces.
A circulation model for oceans and estuaries based on unstructured triangular grids.
Tools for working with the SELFE model.
A package for performing the various basic operations that are required in sequential data assimilation systems. These operations include square root or ensemble observational updates (with global or local parameterization of the forecast error statistics), adaptive statistical parameterizations, anamorphosis transformations, or the computation of truncated Gaussian estimators. SESAM also provides diagnostic tools, to compute observation representers, EOF decompositions or regional RMS misfits, and various utilities for extracting observations, converting between file formats or performing simple algebraic operations.
A software for solving systems of coupled partial differential equations (PDEs) by the finite element method in 2D and 3D. It can be viewed both as black-box PDE solver, and as a Python package which can be used for building custom applications. The word “simple” means that complex FEM problems can be coded very easily and rapidly.
A toolbox that allows spatially adaptive sparse grids to be employed straightforwardly and flexibly, without the vast initial overhead that has to be spent when implementing sparse grids and the corresponding algorithms. To be able to deal with different kinds of problems in a spatially adaptive way - ranging from interpolation and quadrature via the solution of differential equations to regression, classification, and more - the main motivation behind the development was to create a toolbox which can be used in a very flexible and modular way by different users in different applications.
The main features of the sparse grid toolbox are efficiency and flexibility, both of which can sometimes be nasty rivals, for example if the reusability of an algorithm for different purposes requires extra data structures or control mechanisms, thus slowing down special algorithmic variants. To ensure performance at run-time, we use C++ for all performance-critical parts. Considering flexibility, we have spent a great deal of effort ensuring modularity, reusability and the separation of data structures and algorithms. Furthermore, we provide the means to use the SG++ toolbox from within Python, Matlab, Java, and C++, of course.
A set of two applications to convert ESRI ShapeFiles into Google Earth KML.
A suite of data representation and management tools based on a patented sheaf data model. The SheafSystem™ uses advanced mathematics - posets, lattices, sheaves, and fiber bundles - to revolutionize the handling of the complex, structure-rich data sets of scientific computing. SheafSystem™ tools make it easy to construct, manipulate, store, retrieve, and inter-operate diverse representations of physical data.
The machine learning toolbox’s focus is on large scale kernel methods and especially on Support Vector Machines (SVM). It provides a generic SVM object interfacing to several different SVM implementations. The toolbox not only provides efficient implementations of the most common kernels, like the Linear, Polynomial, Gaussian and Sigmoid Kernel but also comes with a number of recent string kernels. SHOGUN is implemented in C++ and interfaces to Matlab(tm), R, Octave and Python.
A set of codes for spherical harmonic transforms.
SSHT - spin spherical harmonic transforms
S2 - functions on the sphere
S2LET - fast wavelets on the sphere
S2DW - steerable scale discretised wavelets on the sphere
FastCSWT - fast directional continuous spherical wavelet transform
FLAG - exact Fourier-Laguerre transform on the ball
FLAGLET - exact wavelets on the ball
SICOPOLIS (SImulation COde for POLythermal Ice Sheets) is a 3-d dynamic/thermodynamic model which simulates the evolution of large ice sheets. It was originally created as a part of the doctoral thesis by Greve (1995) in a version for the Greenland Ice Sheet. Since then, SICOPOLIS has been developed continuously and applied to problems of past, present and future glaciation of Greenland, Antarctica, the entire northern hemisphere and also the polar ice caps of the planet Mars.
The model is based on the shallow ice approximation for grounded ice and the shallow shelf approximation for floating ice (e.g., Greve and Blatter 2009). It is coded in Fortran 90 and uses finite difference discretisation on a staggered (Arakawa C) grid, the velocity components being taken between grid points. Its particularity is the detailed treatment of basal temperate layers (that is, regions with a temperature at the pressure melting point), which are positioned by fulfilling a Stefan-type jump condition at the interface to the cold ice regions. Within the temperate layers, the water content is computed, and its influence on the ice viscosity is taken into account.
The coding is based on a consequent low-tech philosophy. All structures are kept as simple as possible, and advanced coding techniques are only employed where it is deemed appropriate. The use of external libraries is kept at an absolute minimum. In fact, SICOPOLIS can be run without external libraries at all, which makes the installation very easy and fast.
An open source software to solve hyperbolic equations on dynamically changing, fully-adaptive conforming 2D triangular grids. It offers a kernel-based way to solve hyperbolic problems and to apply steering during the simulation.
Scientists today have the ability to generate data at an unprecedented scale and rate and, as a result, they must increasingly turn to parallel data processing engines to perform their analyses. However, the simple execution model of these engines can make it difficult to implement efficient algorithms for scientific analytics. In particular, many scientific analytics require the extraction of features from data represented as either a multidimensional array or points in a multidimensional space. These applications exhibit significant computational skew, where the runtime of different partitions depends on more than just input size and can therefore vary dramatically and unpredictably. In the SkewReduce project, we explore how to alleviate this skew problem in a large MapReduce cluster by requesting only minimal information from users about their analysis tasks.
A library for reading and writing a wide variety of scientific data to binary, disk files. The files Silo produces and the data within them can be easily shared and exchanged between wholly independently developed applications running on disparate computing platforms. Consequently, Silo facilitates the development of general purpose tools for processing scientific data. One of the more popular tools that process Silo data files is the VisIt visualization tool.
Silo supports gridless (point) meshes, structured meshes, unstructured-zoo and unstructured-arbitrary-polyhedral meshes, block structured AMR meshes, constructive solid geometry (CSG) meshes, piecewise-constant (e.g. zone-centered) and piecewise-linear (e.g. node-centered) variables defined on the node, edge, face or volume elements of meshes as well as the decomposition of meshes into arbitrary subset hierarchies including materials and mixing materials. In addition, Silo supports a wide variety of other useful objects to address various scientific computing application needs. Although the Silo library is a serial library, it has some key features which enable it to be applied quite effectively and scalable in parallel.
Architecturally, the library is divided into two main pieces; an upper-level application programming interface (API) and a lower-level I/O implementation called a driver. Silo supports multiple I/O drivers, the two most common of which are the HDF5 (Hierarchical Data Format 5) and PDB (Portable Data Base) drivers.
Simbody uses an advanced formulation of rigid body mechanics to provide results in Order(n) time for any set of n coordinates. This can be used for internal coordinate modeling of molecules, or for coarse-grained models based on larger chunks. It is also useful for large-scale mechanical models, such as neuromuscular models of human gait, robotics, avatars, and animation. Simbody can also be used in real time interactive applications for biosimulation as well as for virtual worlds and games.
This toolset was developed originally for SimTK by Michael Sherman at the Simbios Center at Stanford, with major contributions from Peter Eastman and others. Simbody descends directly from the public domain NIH Internal Variable Dynamics Module (IVM) facility for molecular dynamics developed and kindly provided by Charles Schwieters. IVM is in turn based on the spatial operator algebra of Rodriguez and Jain from NASA’s Jet Propulsion Laboratory (JPL), and Simbody has adopted that formulation.
See also PyCraft
An open source framework for building computer vision applications. With it, you get access to several high-powered computer vision libraries such as OpenCV – without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrix versus bitmap storage.
A scripting language for stochastic structural mechanics based on Lua. SLangTNG provides its functionality by wrapping C++ functions (involving additional C and FORTRAN libraries) in such a way that the C++ objects and methods are accessible from the Lua interpreter; the wrapping is automated using SWIG. In addition to the mathematical algorithms, there is a binding to a GUI providing an interface to the interpreter, symbol inspection and visualization.
A scalable bioinformatics workflow engine, i.e. a friendlier version of make.
A cluster-oriented implementation of self-organizing maps. It relies on MPI for distributing the workload, and it can be accelerated by CUDA on a GPU cluster. A sparse kernel is also included, which is useful for training maps on vector spaces generated in text mining processes.
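As background, the update rule at the heart of any self-organizing map implementation can be sketched serially in a few lines of numpy. This is an illustration of the algorithm only, not Somoclu's API or its MPI/CUDA parallelization:

```python
import numpy as np

# A minimal serial sketch of one online SOM training step -- illustrates the
# update rule only; a cluster implementation distributes many such updates.
rng = np.random.default_rng(0)
grid_h, grid_w, dim = 4, 4, 3
codebook = rng.random((grid_h, grid_w, dim))   # one weight vector per map node
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                              indexing="ij"), axis=-1)

def train_step(x, codebook, lr=0.5, sigma=1.0):
    # Best-matching unit (BMU): the node whose weight vector is closest to x.
    dists = np.linalg.norm(codebook - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # Gaussian neighbourhood around the BMU, measured on the map grid.
    grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
    h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]
    # Pull the BMU and its neighbours towards the sample.
    codebook += lr * h * (x - codebook)
    return bmu

x = rng.random(dim)
bmu = train_step(x, codebook)
```

A parallel implementation's contribution is distributing these updates (typically in batch form) across nodes and GPUs; the rule itself stays the same.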
SOSIE (Sosie is Only a Surface Interpolation Environment) is a versatile tool that allows fast and high-quality 2D and 3D interpolation of geophysical fields from one gridded domain to another. It is written in Fortran 90 and uses NetCDF as its input and output file format. Compared to more widely used interpolation methods such as bilinear or bicubic splines, the Akima method yields, at extremely low numerical cost, continuous and smooth interpolated fields without errors related to overshoots.
The Akima algorithm can only be used for interpolating fields given on non-distorted horizontal grids, i.e. on so-called regular lat-lon domains in which the latitude and longitude arrays are 1D and depend only on j and i respectively. A bilinear interpolation alternative is included in SOSIE and can be used for distorted input domains. However, there is no such limitation on the target grid: both regular and distorted target grids are supported by the Akima method.
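The bilinear option rests on the usual principle; here is a generic sketch of bilinear interpolation within one grid cell (an illustration only, not SOSIE's Fortran code):

```python
# Generic bilinear interpolation within one grid cell.  (tx, ty) are the
# fractional positions of the target point inside the cell, and fij are the
# field values at the four surrounding source grid points.
def bilinear(f00, f10, f01, f11, tx, ty):
    return (f00 * (1 - tx) * (1 - ty)
            + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty
            + f11 * tx * ty)

# Halfway across the cell in both directions: the average of the corners.
center = bilinear(1.0, 2.0, 3.0, 4.0, 0.5, 0.5)
```

Because the weights depend only on the target point's position within the cell, the scheme works even when the input cell is a distorted quadrilateral, which is why it complements the Akima method for curvilinear input grids.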
A tool for N-dimensional geometric modeling with possibilities of parametrized calculations, numerical optimization, and solving systems of geometrical equations with automatic differentiation.
An open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much more quickly than with disk-based systems like Hadoop MapReduce. To make programming faster, Spark provides clean, concise APIs in Scala, Java and Python. You can also use Spark interactively from the Scala and Python shells to rapidly query big datasets.
A collection of FORTRAN77 programs and subroutines facilitating computer modeling of geophysical processes. The package contains subroutines for computing common differential operators including divergence, vorticity, latitudinal derivatives, gradients, the Laplacian of both scalar and vector functions, and the inverses of these operators. For example, given divergence and vorticity, the package can be used to compute velocity components, then the Laplacian inverse can be used to solve the scalar and vector Poisson equations. The package also contains routines for computing the associated Legendre functions, Gauss points and weights, multiple fast Fourier transforms, and for converting scalar and vector fields between geophysical and mathematical spherical coordinates.
An object-oriented python interface to the NCAR SPHEREPACK library. Can perform spherical harmonic transforms to and from regularly spaced and gaussian lat/lon grids.
Spheroidal Wave Functions
This site provides source code for Fortran computer programs that calculate accurate values for Mathieu functions and both prolate and oblate spheroidal wave functions over extremely wide parameter ranges. The program matfcn delivers values for angular and/or radial (i.e., modified) Mathieu functions of integer order. The program profcn calculates values for angular and/or radial prolate spheroidal wave functions. Oblfcn calculates corresponding values for oblate angular and radial functions. Matfcn performs calculations in double precision (real*8) arithmetic. Both profcn and oblfcn utilize quadruple precision (real*16) arithmetic to calculate accurate values over the widest possible parameter ranges.
A platform of Smoothed Particle Hydrodynamics (SPH) codes.
Developed to study free-surface flow phenomena where Eulerian methods can be difficult to apply, such as waves or impact of dam-breaks on off-shore structures. DualSPHysics is a set of C++, CUDA and Java codes designed to deal with real-life engineering problems.
A free and open source visualisation tool for exploring output from Smoothed Particle Hydrodynamics (SPH) simulations in one, two and three dimensions, focussed mainly on astrophysical applications. It is written in Fortran 90 and utilises the giza/cairo graphics libraries to do the actual plotting. It is based around a command-line menu structure but utilises the interactive capabilities of giza to manipulate data interactively in the plotting window.
A fast sparse matrix/array library written in Python and based on the C++ matrix library Eigen. A sparse matrix is one in which many of the elements are zeros, and by storing only non-zero elements, one can often make memory and computational savings over dense matrices which store all elements. The library supports (compressed) sparse matrices, sparse vectors and a number of linear algebra operations (such as the randomised SVD and matrix norm). Furthermore, SpPy has a similar interface to numpy so that existing code requires minimal change to work with sparse matrices and vectors.
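To see why sparse storage pays off, here is a generic CSR (compressed sparse row) sketch in plain Python/numpy. It illustrates the storage scheme and a matrix-vector product; it is not SpPy's API:

```python
import numpy as np

# Generic CSR (compressed sparse row) storage: only non-zero entries are
# kept, with their column indices and per-row offsets into the data array.
def to_csr(dense):
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)
                indices.append(j)
        indptr.append(len(data))
    return np.array(data), np.array(indices), np.array(indptr)

def csr_matvec(data, indices, indptr, x):
    # y[i] is the dot product of the stored entries of row i with x.
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        s, e = indptr[i], indptr[i + 1]
        y[i] = data[s:e] @ x[indices[s:e]]
    return y

A = np.array([[0., 2., 0.],
              [1., 0., 0.],
              [0., 0., 3.]])
data, indices, indptr = to_csr(A)
y = csr_matvec(data, indices, indptr, np.array([1., 1., 1.]))
```

For a matrix with nnz non-zeros, CSR stores nnz values plus nnz + n + 1 integers instead of n*m values, which is the memory saving the entry describes.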
A generic system for defining, writing and processing options files for scientific computer models. The interfaces to scientific computer models are frequently primitive, under-documented and ad hoc text files. This makes using and developing the model in question difficult and error-prone. With Spud, the model developer need only write a rules file (schema) which defines the options the model takes and the relationships between them. The Spud component Diamond then provides an automatically generated graphical user interface which guides the user and validates the user's input against the schema. Diamond writes out an XML options file for use in Spud.
Toolkit for singular spectrum analysis.
A unified runtime system for heterogeneous multicore architectures.
STOQS (Spatial Temporal Oceanographic Query System) is a geospatial database web application designed to provide efficient access to in situ oceanographic measurement data across any dimension, where "dimension" is considered in the broadest sense: any spatial dimension, time, parameter, platform, or any other measured parameter data value.
SUNTANS is a nonhydrostatic, unstructured-grid, parallel, coastal ocean simulation tool that solves the Navier-Stokes equations under the Boussinesq approximation. The formulation is based on the method outlined by Casulli in his 1999 papers, where the free surface and vertical diffusion are discretized with the theta-method, which eliminates the Courant condition associated with fast free-surface waves and the friction term associated with small vertical grid spacings at the free-surface and bottom boundaries. The grid employs z-levels in the vertical and triangular cells in the planform. When wetting and drying is absent, advection of momentum is accomplished with the second-order accurate unstructured-grid scheme of Perot (2000). In the presence of wetting and drying, the semi-Lagrangian formulation is employed. Scalar advection is accomplished semi-implicitly using the method of Gross (1999), in which continuity of volume and mass is guaranteed when wetting and drying is employed. The wetting and drying capabilities of SUNTANS enable its use for coastal as well as estuarine domains. The theta-method for the free surface yields a two-dimensional Poisson equation, and the nonhydrostatic pressure is governed by a three-dimensional Poisson equation. Both are solved with the preconditioned conjugate gradient algorithm with diagonal preconditioning. SUNTANS is written in the C programming language, and the message-passing interface (MPI) is employed for use in a distributed-memory parallel computing environment. Load balancing and grid partitioning are managed with the ParMETIS package.
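As a generic illustration of the theta-method mentioned above (a sketch of the scheme, not SUNTANS's exact discretization), a time-dependent equation $\mathrm{d}\eta/\mathrm{d}t = F(\eta)$ is advanced as

```latex
\frac{\eta^{n+1} - \eta^{n}}{\Delta t}
  = \theta\, F\!\left(\eta^{n+1}\right) + (1 - \theta)\, F\!\left(\eta^{n}\right),
  \qquad 0 \le \theta \le 1,
```

where $\theta = 0$ is forward Euler, $\theta = 1$ backward Euler, and $\theta = \tfrac{1}{2}$ Crank-Nicolson. Choosing $\theta \ge \tfrac{1}{2}$ treats the stiff free-surface and vertical-diffusion terms implicitly, which is what removes the associated stability (Courant) restrictions on the time step.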
A general-purpose numerical tool for simulating non-hydrostatic, free-surface, rotational flows that provides a general basis for describing complex changes to rapidly varied flows and wave transformations in coastal waters, ports and harbours.
An education-oriented code that implements simple Finite Volumes models that solve the shallow water equations - in a problem setting as it would be used for tsunami simulation. SWE has a modular design that allows parallelisation using different programming paradigms, such as MPI, OpenMP, or CUDA (further tests were done with Intel TBB/ArBB and OpenCL).
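The equations such a code discretizes are, in a common conservative form (with water depth $h$, depth-averaged velocities $u$ and $v$, bathymetry $b$ and gravitational acceleration $g$):

```latex
\begin{aligned}
\partial_t h + \partial_x (hu) + \partial_y (hv) &= 0,\\
\partial_t (hu) + \partial_x \bigl(hu^2 + \tfrac{1}{2} g h^2\bigr) + \partial_y (huv) &= -g h\, \partial_x b,\\
\partial_t (hv) + \partial_x (huv) + \partial_y \bigl(hv^2 + \tfrac{1}{2} g h^2\bigr) &= -g h\, \partial_y b.
\end{aligned}
```

A finite volume scheme updates cell averages of $(h, hu, hv)$ from numerical fluxes across cell edges, which is the per-cell independence that makes the method easy to parallelize with MPI, OpenMP or CUDA.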
Swift lets you write parallel scripts that run many copies of ordinary programs concurrently.
Computes Love numbers of a spherically symmetric, self-gravitating Earth by the viscoelastic normal mode method. TABOO also simulates the response of the Earth to surface loading. Post-glacial deformations can be modeled in terms of surface displacements, geoid height variations and changes of the inertia tensor of the Earth.
Solves the sea level equation, i.e. the integral equation that describes the spatiotemporal variations of sea level associated with the melting of late Quaternary ice sheets.
ALMA is a program that computes the "Love numbers" of a spherically symmetric Earth.
Taiga greatly simplifies the use of science data. It is a self-sufficient bundle of free/open source software that webifies major scientific data formats, such as NetCDF, HDF4 and HDF5. Through webification (w10n), meta attributes and data arrays inside a file can be directly retrieved, transformed, or manipulated using clear and meaningful URLs.
An open source python application that exposes data stores, e.g., HDF and NetCDF files, in the web way. It makes file inner components, such as attributes and data arrays, directly addressable and accessible via well-defined and meaningful URLs. It can be installed as a command line tool and/or a ReSTful web service.
A web framework built on top of CherryPy for producing rich web applications that pair your data with cutting-edge visual interfaces.
An object-oriented distributed control system using CORBA and ZeroMQ. In TANGO all objects are representations of devices. The devices can be on the same computer or distributed over a number of computers interconnected by a network. Network communication uses CORBA or ZMQ depending on the communication type, and can be synchronous, asynchronous or event driven. Configuration data is stored in a database. Programming support is provided for C++, Java and Python; both clients and servers can be written in any of the three languages. TANGO provides a kernel API which hides all the details of network access and provides object browsing, discovery and security features. Some ready-to-use graphical applications (DeviceTree, ATKPanel, Mango) allow you to graphically display data coming from your device(s). Graphical layers above the kernel API have been developed to reduce the development time of specific graphical client software: one exists for Java Swing (ATK), another for C++ based on the Qt libraries (QTango), and still another for Python/PyQt (taurus).
A suite of tools used to design and execute scientific workflows and aid in silico experimentation.
A coordinated group of libraries for representing, processing, and visualizing scientific raster data. Teem includes command-line tools that permit the library functions to be quickly applied to files and streams, without having to write any code.
An integrated suite of solvers for use in the field of free-surface flow.
A Swiss-army-knife library for web programming with a Python interface.
Tools for Energy Model Optimization and Assessment (Temoa) is an open source modeling framework for conducting energy system analysis. The core component of Temoa is a technology explicit energy economy optimization model. The design of Temoa is intended to fulfill a unique niche within the energy modeling community by addressing two critical shortcomings: an inability to conduct third party verification of published model-based results and the difficulty of conducting rigorous uncertainty analysis with large, complex models. Temoa leverages a modern revision control system to publicly archive model source and data, which enables third party verification of all published modeling work. In addition, Temoa represents the first EEO model to be designed - from its initial conceptualization - for operation within a high performance computing environment.
A library of GIS classes and functions for the development of GIS tools. TerraLib provides functions to decode geographical data, spatial analysis algorithms and a conceptual model for a geographical database.
A Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
A tool that allows NumPy and Theano to be used simultaneously with no additional code.
A parallel algorithms library which resembles the C++ Standard Template Library (STL).
A time series analysis Python script with a graphical user interface. It allows the user to detect and quantify causal dependencies and to create high-quality plots of the results.
An open source, standards-based software platform supported by leading mobile operators, device manufacturers, and silicon suppliers for multiple device categories such as smartphones, tablets, netbooks, in-vehicle infotainment devices, and smart TVs. Tizen offers an innovative operating system, applications, and a user experience that consumers can take from device to device.
A package designed to simplify the task of obtaining high resolution USGS datasets in formats readable by modelling and analysis software packages. Files are downloaded in GeoTIFF format, then converted to raw tiles for the WRF preprocessing system (WPS), and NetCDF tiles for the Local Analysis and Prediction System (LAPS) and the Space and Time Multiscale Analysis System (STMAS).
A MATLAB toolbox that implements basic operations with TT-tensors.
LaTeX classes for producing handouts and books according to the style of Edward R. Tufte and Richard Feynman.
An information visualization framework dedicated to the analysis and visualization of relational data. Tulip aims to provide the developer with a complete library, supporting the design of interactive information visualization applications for relational data that can be tailored to the problems he or she is addressing.
Written in C++, the framework enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. One of the goals of Tulip is to facilitate the reuse of components, allowing developers to focus on programming their applications. This development pipeline makes the framework efficient for research prototyping as well as for the development of end-user applications.
A free, cross-platform program that transparently handles calculations with numbers with uncertainties (like 3.14±0.01). It can also yield the derivatives of any expression. Calculations of results with uncertainties, or of derivatives, can be performed either in an interactive session (as with a calculator), or in programs written in the Python programming language. Existing calculation code can run with little or no change. Whatever the complexity of a calculation, this package returns its result with an uncertainty as predicted by linear error propagation theory. It automatically calculates derivatives and uses them for calculating uncertainties. Almost all uncertainty calculations are performed analytically. Correlations between variables are automatically handled, which sets this module apart from many existing error propagation codes.
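The linear (first-order) propagation the entry describes can be sketched in plain Python for the simple case of a product of two uncorrelated quantities. This illustrates the principle only and is not the package's actual API:

```python
import math

# First-order (linear) error propagation for z = x * y with uncorrelated
# inputs: sigma_z^2 ~ (dz/dx)^2 sigma_x^2 + (dz/dy)^2 sigma_y^2,
# where dz/dx = y and dz/dy = x.
def product_uncertainty(x, sx, y, sy):
    z = x * y
    sz = math.sqrt((y * sx) ** 2 + (x * sy) ** 2)
    return z, sz

# E.g. (3.14 +/- 0.01) * (2.0 +/- 0.05)
z, sz = product_uncertainty(3.14, 0.01, 2.0, 0.05)
```

The package generalizes this by computing the partial derivatives automatically for arbitrary expressions and by tracking correlations between variables, so repeated use of the same uncertain quantity is handled correctly.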
A suite of Virtual Geoscience Simulation Tools for modelling discontinuous systems, i.e. particulate, granular, blocky, layered, fracturing and fragmenting systems.
VPython is the Python programming language plus a 3D graphics module called "visual" originated by David Scherer in 2000. VPython makes it easy to create navigable 3D displays and animations, even for those with limited programming experience. Because it is based on Python, it also has much to offer for experienced programmers and researchers.
An open-source, freely available software system for 3D computer graphics, image processing and visualization. VTK consists of a C++ class library and several interpreted interface layers including Tcl/Tk, Java, and Python. VTK supports a wide variety of visualization algorithms including scalar, vector, tensor, texture, and volumetric methods, and advanced modeling techniques such as implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation. VTK has an extensive information visualization framework, has a suite of 3D interaction widgets, supports parallel processing, and integrates with various databases and GUI toolkits such as Qt and Tk.
An open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. Developed through extreme programming methodologies, ITK employs leading-edge algorithms for registering and segmenting multidimensional data.
An effort to automate the language binding process of one of the largest highly template-oriented C++ libraries, the Insight Toolkit image processing library. Currently Python, Java and Tcl language bindings are implemented, but only Python is fully supported.
A document generator that translates a text file with minimal markup into over a dozen other formats.
A reliable UDP based application level data transport protocol for distributed data intensive applications over wide area high-speed networks. UDT uses UDP to transfer bulk data with its own reliability control and congestion control mechanisms. The new protocol can transfer data at a much higher speed than TCP does.
The Unified Form Language is an embedded domain specific language for definition of variational forms intended for finite element discretization. More precisely, it defines a fixed interface for choosing finite element spaces and defining expressions for weak forms in a notation close to mathematical notation.
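For example, the weak form of the Poisson problem $-\nabla^2 u = f$ is: find $u \in V$ such that

```latex
\int_\Omega \nabla u \cdot \nabla v \,\mathrm{d}x
  \;=\; \int_\Omega f\, v \,\mathrm{d}x
  \qquad \text{for all } v \in V.
```

In UFL this reads almost verbatim, along the lines of `a = inner(grad(u), grad(v))*dx` and `L = f*v*dx` (a sketch of the canonical example, with the choice of finite element space omitted).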
The underling library provides simple, scalable means to manipulate MPI-parallel, three dimensional pencil decompositions using FFTW. Pencil decompositions are a natural way to distribute O(n3) data across O(n2) processors and are well-suited for memory-intensive, structured spectral turbulence simulations and postprocessing codes. It may be useful in other domains as well. The library is written in C99 and may be used by C89 or C++ applications.
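As an illustration of the bookkeeping a pencil decomposition involves (this is a Python sketch, not underling's C API), each rank in a pr x pc process grid owns a sub-block that spans the full extent of one axis and a near-even share of the other two:

```python
# 2D "pencil" decomposition of an n^3 grid over a pr x pc process grid.
def pencil_extent(n, p, r):
    # Near-even split of n points over p ranks; rank r owns [start, stop).
    base, rem = divmod(n, p)
    start = r * base + min(r, rem)
    return start, start + base + (1 if r < rem else 0)

def x_pencil(n, pr, pc, rank):
    # Map the linear rank onto process-grid coordinates (row, col).
    row, col = divmod(rank, pc)
    y0, y1 = pencil_extent(n, pr, row)
    z0, z1 = pencil_extent(n, pc, col)
    # Full x-extent, distributed y- and z-extents: an "x pencil".
    return (0, n), (y0, y1), (z0, z1)

# Extents owned by each of 4 ranks for an 8^3 grid on a 2 x 2 process grid.
extents = [x_pencil(8, 2, 2, r) for r in range(4)]
```

A distributed 3D FFT then transforms along the locally contiguous axis, transposes the data to a y pencil, transforms again, and so on; the transposes are the only communication steps.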
An adaptive finite element solver for fluid and structure mechanics. Unicorn aims at developing one unified continuum mechanics solver for a wide range of applications, based on the suite DOLFIN/FFC/FIAT.
UV-CDAT (Ultrascale Visualization Climate Data Analysis Tools, http://uv-cdat.llnl.gov/) is an analysis and visualization environment with multiple features to address the most common tasks performed by scientific researchers in the publication and dissemination of their results.
The Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Researchers provides an interactive 3D visualization environment that runs on most UNIX and Windows systems equipped with modern 3D graphics cards.
A program for the analysis of ocean tide data.
A generic C++ library for data assimilation, providing methods and tools designed to be relevant to a large class of problems involving high-dimensional numerical models. Verdandi provides a Python interface generated by SWIG.
Very Fast Machine Learning is a toolkit for mining high-speed data streams and very large data sets. VFML is made up of three main components. The first is a collection of tools and APIs that help a user develop new learning algorithms. The second component is a collection of implementations of important learning algorithms. The third component is a collection of scalable learning algorithms.
VFML provides code to help read and process training data, to gather sufficient statistics from it, ADTs for several important machine learning structures, and various helper code. You can get an overview of what is provided by visiting the Core APIs and Utility APIs sections of the documentation.
VFML contains a series of tools for working with data sets: cleaning them, sampling them, splitting them into train/test sets. It also has tools to help you experiment with learning algorithms. See the Other Tools documentation heading for more information.
VFML contains tools for learning decision trees, for learning the structure of belief nets (aka Bayesian networks), and for clustering. Much of this code is easy to modify or extend (several other researchers have benefited from the bnlearn program, for example), and much of it can scale to learning from very large data sets or from data streams. You can get an overview of all the learners by checking out the Learning Programs section.
A cross-platform visualization and analysis application for atmospheric data. The application uses the Python language as the means through which you provide commands to the application. The Python interfaces for CODA and BEAT-II are included so you can directly ingest product data from within VISAN. Using the Python language and some additional included mathematical packages you will be able to perform analysis on your data. Finally, VISAN provides some very powerful visualization functionality for 2D plots and worldplots.
An interactive parallel visualization and graphical analysis tool for viewing scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images for presentations. VisIt contains a rich set of visualization features so that you can view your data in a variety of ways. It can be used to visualize scalar and vector fields defined on two- and three-dimensional (2D and 3D) structured and unstructured meshes. It was designed to interactively handle very large data set sizes in the terascale range, and works well down to small data sets in the kilobyte range.
An open-source scientific workflow and provenance management system that provides support for simulations, data exploration and visualization.
Adds spatial and temporal data access, data pre-processing and data analysis capabilities to VisTrails, the Python scientific workflow tool. This includes cloud, analytics and standards-based spatial data access. Current data modules allow access to OGC Web Services, such as WFS, WMS and SOS, as well as other spatial data sources, including PostGIS databases and netCDF datacubes. Additional modules provide wrappers around standard R, Octave and SQL scripts for data analysis capabilities. Spatial data can be visualised using a mapping module (via the QGIS API), as well as more conventional plots, for example for time-series and windrose data.
A Python library for visualizing 1-D to 4-D data, i.e. an object-oriented layer of Python built on top of OpenGL.
The Velocity Mapping Toolbox (VMT) is Matlab-based software for processing and visualizing ADCP data collected along transects in rivers or other bodies of water. VMT allows rapid processing, visualization, and analysis of a range of ADCP datasets and includes utilities to export ADCP data to files compatible with ArcGIS, Tecplot, and Google Earth. The software can be used to explore patterns of three-dimensional fluid motion through several methods for calculation of secondary flows (e.g. Rhoads and Kenworthy, 1998; Lane et al., 2000). The software also includes capabilities for analyzing the acoustic backscatter and bathymetric data from the ADCP. A user-friendly graphical user interface (GUI) enhances program functionality and provides ready access to two- and three-dimensional plotting functions, allowing rapid display and interrogation of velocity, backscatter, and bathymetry data.
A collection of command-line tools for researchers in machine learning, data mining, and related fields. Waffles seeks to be the world’s most comprehensive collection of command-line tools for machine learning and data mining. Our native tools have minimal dependencies (no interpreter, VM, or runtime environment is necessary), and build cross-platform.
An automated analysis code generator.
A web-based visualization platform designed to enable visualization of any available data by anyone for any purpose. Weave is an application development platform supporting multiple levels of users – novice to advanced – as well as the ability to integrate, analyze, and visualize data at "nested" levels of geography, and to disseminate the results in a web page.
The Wolfram Language is a highly developed knowledge-based language that unifies a broad range of programming paradigms and uses its unique concept of symbolic programming to add a new level of flexibility to the very concept of programming.
A flexible and extensible open-source code for (mostly) Discrete Element (DEM) simulations. It is geared towards non-trivial, challenging scenarios — sieving, segregation, conveyors, membranes: you name it.
Online collaborative LaTeX editor with integrated rapid preview.
X10 is a modern language in the strongly typed, object-oriented programming tradition. Its design draws on the experience of team members with foundational models of concurrency, programming language design and semantics, type systems, static analysis, compilers, runtime systems, and virtual machines. Our goals were simple: design a language that fundamentally focuses on concurrency and distribution, and is capable of running with good performance at scale, while building on the established productivity of object-oriented languages. In this, we sought to span two distinct programming language traditions: the older tradition of statically linked, ahead-of-time compiled languages such as Fortran, C and C++, and the more modern dynamically linked, VM-based languages such as Java, C# and F#. X10 supports both compilation to the JVM and, separately, compilation to native code.
A two-dimensional model for wave propagation, long waves and mean flow, sediment transport and morphological changes of the nearshore area, beaches, dunes and backbarrier during storms.
The need for a standardized method to exchange scientific data between High Performance Computing codes and tools led to the development of the eXtensible Data Model and Format (XDMF). XDMF uses XML to store Light data and to describe the data Model; HDF5 is used to store Heavy data. The data Format is stored redundantly in both XML and HDF5, which allows tools to parse the XML to determine the resources that will be required to access the Heavy data. While not required, a C++ API is provided to read and write XDMF data. This API has also been wrapped so it is available from popular languages like Python, Tcl, and Java, but the API is not necessary in order to produce or consume XDMF data. Currently, several HPC codes that already produce HDF5 data use native text output to produce the XML necessary for valid XDMF.
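A hypothetical minimal XDMF file for a node-centered scalar on a 64^3 grid might look like the following; the element names follow the XDMF model, while the HDF5 file name and dataset path are purely illustrative:

```xml
<Xdmf Version="2.0">
  <Domain>
    <Grid Name="mesh" GridType="Uniform">
      <Topology TopologyType="3DCoRectMesh" Dimensions="64 64 64"/>
      <Geometry GeometryType="ORIGIN_DXDYDZ">
        <DataItem Dimensions="3" Format="XML">0.0 0.0 0.0</DataItem>
        <DataItem Dimensions="3" Format="XML">0.1 0.1 0.1</DataItem>
      </Geometry>
      <Attribute Name="pressure" Center="Node">
        <!-- Heavy data lives in HDF5; the XML only describes it. -->
        <DataItem Dimensions="64 64 64" NumberType="Float" Format="HDF">
          results.h5:/pressure
        </DataItem>
      </Attribute>
    </Grid>
  </Domain>
</Xdmf>
```

A tool can read this small XML file to learn the mesh shape and data sizes before deciding whether and how to open the (potentially very large) HDF5 file.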
An automatic wrapper generator for C/C++ written in pure Python. Currently, xdress may generate Python bindings (via Cython) for C++ classes and functions, and in-memory wrappers for C++ standard library containers (sets, vectors, maps).
XML-IO-SERVER is a library dedicated to the I/O management of climate codes.
An XML-based simulation authoring environment. The proposed description language can describe mathematical objects such as systems of ordinary differential equations, systems of non-linear equations, partial differential equations in two dimensions, or simple curves and surfaces, along with the parameters on which these objects depend. This language is independent of the software and helps ensure the relative longevity of authors' work, as well as collaborative work and content reuse.
A general-purpose storage system that covers most storage needs in a single deployment. It is open source, requires no special hardware or kernel modules, and can be mounted on Linux, Windows and OS X. XtreemFS is easy to set up and administer, and lets you maintain fewer storage systems.
A Python translator framework designed to create source to source translators, code analysis tools or just to teach compiler technology without the need of learning large pieces of code.
An extensible open-source framework for discrete numerical models, focused on the Discrete Element Method. The computational parts are written in C++ using a flexible object model, allowing independent implementation of new algorithms and interfaces. Python is used for rapid and concise scene construction, simulation control, postprocessing and debugging.
Yorick is an interpreted programming language for scientific simulations or calculations, postprocessing or steering large simulation codes, interactive scientific graphics, and reading, writing, or translating large files of numbers. Yorick includes an interactive graphics package, and a binary file package capable of translating to and from the raw numeric formats of all modern computers. Yorick is written in ANSI C and runs on most operating systems. Yorick has a compact syntax, similar to C, but with array operators. It is easily expandable through dynamic linking of C libraries, allows efficient manipulation of arbitrary size/dimension arrays, and offers extensive graphic capabilities.
A community-developed analysis and visualization toolkit for astrophysical simulation data. yt runs both interactively and non-interactively, and has been designed to support as many operations as possible in parallel.
A WPS platform.
An array programming language designed from first principles for fast execution on both sequential and parallel computers. It provides a convenient high-level programming medium for supercomputers and large-scale clusters with efficiency comparable to hand-coded message passing. It is the perfect alternative to using a sequential language like C or Fortran and a message passing library like MPI.