Open Data is about to explode.
Data distribution demo from playground.fmeserver.com

Following The Linux Foundation's announcement this week, open data has become more important than ever. Here's how to use it to your advantage, with tips for building your own open data portal.

This week at the Open Source Summit, The Linux Foundation announced the Community Data License Agreement. Built on concepts that have worked for ages with open software, this new licensing model is for sharing and collaborating on open data.

This is amazing news. The stats show it: public interest in open data is escalating. Check out this graph of open data downloads from Canadian Open Government Analytics.

The demand is there because access to more data means bigger potential for research, data-driven decisions, app development, non-profit ventures, machine learning (models generally need large amounts of data to learn from), and much more.

Now that demand can be met. The Community Data License Agreement, along with drag-and-drop solutions for data automation, makes it easier than ever to create your own portal.

Consuming Open Data

First, find your data source.

Governments are the main providers. Many Western governments are mandated to provide open data. You can find portals at the city, state, and federal levels, plus via intergovernmental organizations. For examples, check out the FME-hosted portals for the City of Surrey, Surrey Heath (England), and the City of Vancouver.

Non-governmental organizations and non-profits are now producing open data. Crowd-sourced open data and geomapping ensure time and money are spent where needed. For example, a crowd-sourced relief effort following the Nepal earthquake resulted in the mapping of thousands of miles of roads and tens of thousands of buildings, which enabled rescue plans.

Private companies are offering open data upon realizing the value of transparency. Adopting open data practices has helped many corporations improve net profits.

Academic institutions, including universities and scientific research organizations, are also sharing data. Progress is being made towards transparent biomedical research, and already the Accelerating Medicines Partnership has collaborated on data in three disease areas.

Once you've found your open data provider of choice, you can begin to download the data and use it in your workflows. If you need to integrate open data with other sources, check for changes, transform or export it into a different format, or pull the data automatically (on a schedule, in real time, etc.), you can use FME.
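To make "check for changes" concrete, here is a minimal Python sketch of one common approach: fingerprint each download and compare it with the fingerprint you stored last time. This is only an illustration, not how FME does it. The function names and the sample records are made up for the example.

```python
import hashlib
import json


def fingerprint(records):
    """Return a stable SHA-256 digest for a list of JSON-serializable records."""
    # sort_keys makes the digest independent of dictionary key order.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_digest, records):
    """Compare a freshly downloaded dataset against the last stored digest."""
    return fingerprint(records) != previous_digest


# Hypothetical sample rows standing in for two downloads of the same dataset.
monday = [{"id": 1, "name": "Park A"}, {"id": 2, "name": "Park B"}]
tuesday = [{"id": 1, "name": "Park A"}, {"id": 2, "name": "Park B (renamed)"}]

stored = fingerprint(monday)
print(has_changed(stored, monday))   # False
print(has_changed(stored, tuesday))  # True
```

Note that this sketch treats a change in record order as a change in the data. If the provider does not guarantee a stable order, sort the records by a key field before fingerprinting.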

5 Tips For Creating Your Own Open Data Portal

Opening your data to the world is trivial: make a GeoCities page and put a link to a PDF file, and— Kidding. With a bit of planning, you can build a useful and aesthetically pleasing webpage that makes your data available to the world.

1. Update the datasets frequently

The problem with many open datasets is they don't get updated enough. Make sure your data is updated regularly. Provide your data as a published feed (e.g. RSS) or API rather than statically downloadable files. This will allow people to consume the endpoint, and if you make updates they will be automatically reflected in the user’s app.

You should also use automation (e.g. FME Server) and connect your portal directly to your master database, rather than duplicating data across two locations.

2. Offer coordinate system choices

For spatial data, offer more than one option for the coordinate system. Your users might want Spherical Mercator (EPSG:3857) for a web mapping application, or WGS84 lat/long (EPSG:4326) for GPS navigation systems, or a precise local projection like State Plane. Give them the freedom to pick, and offer both local and global projections.
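As a rough illustration of what reprojection involves, here is a minimal Python sketch that converts between WGS84 lon/lat (EPSG:4326) and Spherical Mercator (EPSG:3857) using the standard spherical formulas. The function names are invented for this example, and the sample point is only an approximate location for Vancouver. A precise local projection like State Plane is a different story: that needs a full projection library or a tool like FME, not a few lines of math.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, the sphere radius used by EPSG:3857

# Spherical Mercator is only defined up to roughly 85.05 degrees north and south.
MAX_LAT_DEG = math.degrees(2 * math.atan(math.exp(math.pi)) - math.pi / 2)


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Convert EPSG:4326 lon/lat in degrees to EPSG:3857 x/y in metres."""
    if abs(lat_deg) > MAX_LAT_DEG:
        raise ValueError("latitude is outside the Spherical Mercator range")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_wgs84(x, y):
    """Convert EPSG:3857 x/y in metres back to EPSG:4326 lon/lat in degrees."""
    lon_deg = math.degrees(x / EARTH_RADIUS_M)
    lat_deg = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon_deg, lat_deg


# Approximate coordinates for Vancouver, used purely as sample input.
x, y = wgs84_to_web_mercator(-123.1207, 49.2827)
print(x, y)
print(web_mercator_to_wgs84(x, y))
```

A portal could run a conversion like this on request, so users pick the coordinate system at download time instead of you storing one copy of the data per projection.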

3. Ensure the data is good quality

Ensure your data is good quality before making it public. This includes validating geometry, attributes, standards compliance, format-specific issues like XML / JSON structure, and more. Consult this data quality checklist for a thorough guide to geospatial data QA.
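Here is a small Python sketch of what an automated check might look like for point data stored as GeoJSON-style features. It covers only two of the items above (geometry and attributes), and the function name, the required attribute list, and the sample features are all assumptions made for the example.

```python
def validate_feature(feature, required_attributes):
    """Return a list of problems found in one GeoJSON-style Point feature."""
    problems = []
    geometry = feature.get("geometry") or {}
    properties = feature.get("properties") or {}

    if geometry.get("type") != "Point":
        problems.append("geometry is not a Point")
    else:
        coords = geometry.get("coordinates")
        is_pair = isinstance(coords, (list, tuple)) and len(coords) >= 2
        if not is_pair or not all(isinstance(v, (int, float)) for v in coords[:2]):
            problems.append("coordinates are missing or not numeric")
        else:
            lon, lat = coords[0], coords[1]  # GeoJSON order is longitude, then latitude
            if not -180 <= lon <= 180:
                problems.append(f"longitude out of range: {lon}")
            if not -90 <= lat <= 90:
                problems.append(f"latitude out of range: {lat}")

    # Attribute check: every required attribute must be present and non-empty.
    for name in required_attributes:
        if properties.get(name) in (None, ""):
            problems.append(f"missing attribute: {name}")
    return problems


# Hypothetical sample features; the second has latitude and longitude swapped.
good = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-123.12, 49.28]},
    "properties": {"name": "City Hall"},
}
swapped = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [49.28, -123.12]},
    "properties": {"name": ""},
}

print(validate_feature(good, ["name"]))  # []
print(validate_feature(swapped, ["name"]))
```

Swapped latitude and longitude is one of the most common mistakes in published point data, and a range check like this catches it whenever the swapped value lands outside the valid latitude range.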

4. Offer format choices

Offer a choice with respect to format. Some recommendations:

  • GeoJSON. It’s flexible and machine readable, is well suited to serving from an API endpoint, and is instantly viewable in a web environment.
  • XML. It’s machine readable and offers the user power and flexibility for tabular data.
  • JSON. Same reasons as XML, plus it is well suited to serving from an API endpoint.
  • CSV. It’s a tabular format that’s easily read by humans. Excel is also a good one.
  • Esri Shapefile. It’s such a widely used spatial data format that it’s consistently the most popular GIS format in FME usage stats.
  • KML. It’s instantly viewable in a web environment and is the format of choice for Google Maps and Earth.
  • Other useful spatial formats: GML (widely used OGC format), AutoCAD DXF/DWG (for the CAD users), Esri File Geodatabase (because Esri), MapInfo TAB (among the most popular GIS formats).
  • You could also offer PDF, because it looks nice and is easily shareable. Note this should be a supplementary format and not your central focus — because as Safe Software co-founder Dale Lutz likes to say, "PDF is where data goes to die."
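Offering several formats does not have to mean maintaining several copies of the data. As a minimal sketch, the Python below writes the same records out as CSV, JSON, and GeoJSON using only the standard library. The function names, the field names (`name`, `lon`, `lat`), and the sample record are assumptions for the example. Formats like Shapefile or File Geodatabase need a dedicated library or a tool such as FME.

```python
import csv
import io
import json


def to_csv(records, fieldnames):
    """Serialize records (a list of dicts) to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_geojson(records, lon_field="lon", lat_field="lat"):
    """Build a GeoJSON FeatureCollection of Points from flat records."""
    features = []
    for record in records:
        # Everything except the coordinate fields becomes a feature property.
        properties = {k: v for k, v in record.items() if k not in (lon_field, lat_field)}
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [record[lon_field], record[lat_field]],
            },
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample record standing in for a row from your master database.
records = [{"name": "City Hall", "lon": -123.1, "lat": 49.26}]

print(to_csv(records, ["name", "lon", "lat"]))
print(json.dumps(records))
print(json.dumps(to_geojson(records), indent=2))
```

Because every format is generated from the same records, an update to the source shows up in all of them at once.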

5. Choose the right delivery solution

As for delivering the data, here are a few solutions you can leverage (alphabetically, not necessarily ordered by awesomeness). Check out this webinar and ebook to dive into more details.

  • Amazon Web Services (AWS)
  • ArcGIS Open Data
  • CKAN
  • DataPress
  • DKAN
  • FTP
  • GitHub
  • Junar
  • OpenDataSoft
  • Socrata

For a top-notch example, check out the City of Surrey, which won the Open Data for Democracy and the Canadian Open Data Excellence 2016 awards. They offer a vast range of data in an easy-to-navigate site, plus they let users draw a polygon on a map to pick the exact area to download.

I have no doubt open data will continue to rise in popularity. With the introduction of the Community Data License Agreement and automation/cloud solutions readily available, it's more convenient than ever to create an open data portal. Massive amounts of data are being made available for sharing and collaboration this very moment, and it's going to be exciting to watch how the tech world changes as a result.

More articles by Tiana Warner