Resources for Data Science

Data (Open)

Boardgame Data

Data Mining, Data Wrangling, and Data Munging

Data Visualization and Storytelling

  • Data Visualization with JavaScript.
  • R ggplot2 package.
  • R ggvis package.
  • R shiny package.
  • Gephi.
    • Visualization and exploration software for all kinds of graphs and networks. Gephi is open-source and free.
  • Katy Börner.
  • Desislava Hristova.
  • LAB1100.com.
    • Independent research and software development firm that builds high-quality applications driven by research questions, educational challenges, and cultural encounters.
    • NODEGOAT.
  • ParaView.
    • Open-source, multi-platform data analysis and visualization application.
  • Periscopic.
    • Technology to visualize solutions that engage the public and deliver messages of action.
  • plotly.
    • Platform for agile business intelligence and data science.
  • MicroStrategy.com.
    • Build dashboards.
  • SAP Lumira.
    • Take control and connect to multiple data sources, big and small, such as SAP HANA, SAP Business Warehouse, Excel spreadsheets and more.
  • Weave.
    • Web-based analysis and visualization environment designed to enable visualization of any available data by anyone for any purpose.
  • Blocks (examples).
    • D3.js gallery.
  • Bokeh.
    • Python package for interactive and web-based data visualization.
    • Callbacks.
  • Panda3D.
    • Panda3D is a game engine, a framework for 3D rendering and game development for Python and C++ programs. Panda3D is Open Source and free for any purpose, including commercial ventures, thanks to its liberal license.
  • Ren’Py.
    • Ren’Py is a visual novel engine used by hundreds of creators from around the world that helps you use words, images, and sounds to tell interactive stories that run on computers and mobile devices.

GIS & Mapping

GUI & Interfaces

  • Kivy.
    • Open source Python library for rapid development of applications that make use of innovative user interfaces, such as multi-touch apps.

Image Processing

  • SimpleCV.
    • Computer Vision platform using Python.
    • Making your computer see things in the real world.
    • SimpleCV is an open source framework for building computer vision applications. With it, you get access to several high-powered computer vision libraries, such as OpenCV, without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrix versus bitmap storage.
    • Book: Practical Computer Vision, O’Reilly.

Infographics

  • Piktochart.
    • Infographic maker.
  • Canva.
    • Easy, drag-and-drop infographic creator.
  • Vizualize.me.
    • Create your infographic resume for free.
  • Google Charts.
    • Google chart tools are powerful, simple to use, and free; rich gallery of interactive charts and data tools.
  • easel.ly.
    • Create and share visual ideas.
  • infogr.am.
    • Create and publish beautiful visualizations of your data. Interactive, responsive and engaging.
  • Venngage.
    • Everything you need to create and publish infographics is right here.

Kids and Coding

NLP

  • NLTK.
    • Natural Language Toolkit for analyzing written text and writing things like spam filters and chat bots. NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.
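
To make the tokenization and stemming mentioned above concrete, here is a minimal sketch. It assumes NLTK is installed and uses `RegexpTokenizer` and `PorterStemmer`, neither of which needs a corpus download. The helper names `tokenize` and `stem_all`, the sample sentence, and the plain-regex fallback are illustrative assumptions, not part of NLTK.

```python
import re

# NLTK is optional here: if it is missing, the sketch falls back to a plain
# regular expression (an assumption made so the example always runs).
try:
    from nltk.stem import PorterStemmer
    from nltk.tokenize import RegexpTokenizer
except ImportError:
    PorterStemmer = None
    RegexpTokenizer = None


def tokenize(text):
    """Split text into word tokens (hypothetical helper)."""
    if RegexpTokenizer is not None:
        return RegexpTokenizer(r"\w+").tokenize(text)
    return re.findall(r"\w+", text)


def stem_all(tokens):
    """Reduce each token to its stem with NLTK's Porter stemmer.

    Without NLTK the tokens are only lowercased, which is NOT real stemming.
    """
    if PorterStemmer is not None:
        stemmer = PorterStemmer()
        return [stemmer.stem(token) for token in tokens]
    return [token.lower() for token in tokens]


sentence = "NLTK makes text processing easier."
tokens = tokenize(sentence)
stems = stem_all(tokens)
print(tokens)
print(stems)
```

Tagging and parsing follow the same pattern but rely on models and corpora that NLTK downloads separately.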

Online Programs, Courses, Lessons, and Tutorials

Online Reporting & Publishing

Parallel and Distributed Computing (Big Data)

  • Bases de données documentaires et distribuées (Document-Oriented and Distributed Databases).
    • Parallel computing, NoSQL, XML, MapReduce, distributed computing, indexing, Elasticsearch, JSON, Couchbase, Pig, Spark, MongoDB, replication, scalability, cloud, virtual machines, Markov chains, PageRank, Solr, CouchDB, XQuery.
  • Cloudera.
    • Hadoop ecosystem distribution.
  • Hortonworks.
    • Hadoop ecosystem distribution.

Python Web Frameworks

Statistics, Statistical & Machine Learning

Virtual Console and Virtual Coding

  • repl.it.
    • Everything you need to teach coding in your classroom.
    • Code in the cloud and interactive environment.
    • 30 languages.
  • Codeanywhere.
    • Cross platform cloud IDE.
  • R-Fiddle.
    • Environment to write, run and share R-code right inside your browser.
    • It even offers the option to include packages.
  • dataiku.
    • A collaborative data science platform.
  • Heroku.
    • Platform for building with modern architectures.
    • Innovating quickly and scaling precisely to meet demand.

Visualizing and Inspecting the Code

  • Python Tutor.
    • Online; Python, Java, JavaScript, TypeScript, Ruby, C, C++.
  • PyDesk Visualizer.
    • PyDesk Visualizer is a desktop-based Python visualizer. It shows how Python code runs, making program execution easier to understand.
  • bl.ocks.org.
    • Simple viewer for sharing code examples hosted on GitHub Gist.