In this session we will discuss the problems that Data Engineering at the Department of City Planning encountered managing datasets and introduce the open source tooling that we’ve built to manage metadata, generate documentation, enforce data quality, and automate distribution of data to platforms, with a focus specifically on NYC Open Data.

This talk is aimed primarily at those who have an interest in automating some or all of the above. We will walk through how we, at City Planning, catalog our dataset metadata; how that metadata is used to generate READMEs, data dictionaries, and other metadata files; how to leverage metadata for automated QA; how to automate distribution of data to destinations like the Tyler/Socrata open data platform, databases, FTP servers, and data lakes; and finally how interested developers can make use of our framework and potentially contribute code of their own.

This presentation is part of the Open Data @ NYC Planning event series.

Click here to RSVP for virtual attendance.

Click the blue “Going” button below to RSVP for in-person attendance at the Department of City Planning’s offices (120 Broadway, New York, NY 10271).

As data analysts and engineers know, quality source data is crucial to sound analyses and healthy pipelines. It can also take a lot of time, effort, and resources to wrangle. Data Engineering at the Department of City Planning has written a new tool (a Python module and CLI) to manage data extraction and archival.

In this session, we will show potential users how they might simplify and automate extracting data from external sources like NYC Open Data and ArcGIS Online. We will touch on some of the built-in features of the tool as well as where we’re going: simple data cleaning, automatic geocoding, data validation, and more.

This presentation is part of the Open Data @ NYC Planning event series.

Click here to RSVP for virtual attendance.

Click the blue “Going” button below to RSVP for in-person attendance at the Department of City Planning’s offices (120 Broadway, New York, NY 10271).

Join this event to explore New York's Spanish-language publishing ecosystem through a data lens. This interactive session will present initial findings from our mapping of publishers, bookstores, libraries, and authors working in Spanish, highlighting both the existing data and critical information gaps. Led by Juan Pablo Marín Díaz (Datasketch) and Viviana Castiblanco (editor), we will examine how better data collection and sharing could strengthen the connections among authors, publishers, and readers in NYC's Hispanic literary community.
The session will combine data visualization with industry insight, presenting perspectives from NYC-based Spanish-language publishers, booksellers, and authors. Participants will have the opportunity to help identify priority areas for future data collection and to discuss collaborative approaches to building more complete open datasets on Spanish-language publishing in NYC.
This event is ideal for publishers, booksellers, librarians, authors, and anyone interested in Spanish-language literature or cultural data. Whether you are a publishing professional making decisions with limited visibility, a data enthusiast interested in cultural applications, or a community member involved in Spanish-language literature, join us to help map and strengthen this vital cultural ecosystem.

This event will be held in Spanish. Attendance is limited, and registration is required to confirm your place. Only registered guests will be admitted.

DataKind started 2024 with an ambitious goal: to create an open-access tool populated with data at a hyperlocal level that would foster a deeper understanding of community needs and the complex factors influencing them. Working with collaborators representing many facets of health and wellbeing, DataKind launched the community health indicators software, a tool which enables social service providers, practitioners, policymakers and community stakeholders to access standardized data on their communities of impact and take meaningful action. This resource harmonizes and makes accessible 11 different public data sources of community data at three geographic levels (tract, zip, and county), with 49 unique data indicators drawing from a database of six million rows of data. Users have the option to create a free, secure individual or team account and upload and analyze additional datasets alongside the data included. Users can export analysis, including maps, from the system. This free and open software makes data insights accessible to any end-user regardless of data maturity.

This session will discuss the community-centered software design process, demonstrate the software with several use cases, and offer a rich, facilitated conversation with end-users of the software from social impact and governmental institutions. We will provide an introduction to question formulation, asset mapping, and data interpretation.

Attendees can submit their data questions here for use in the session or for future follow-up: https://forms.gle/TsWfe4Dses3jzVHm8

This presentation, hosted by the NYC Office of Management and Budget, will take participants through the journey of building a centralized data pipeline on a generic cloud platform to deliver accurate, consistent, and timely insights automatically. We will walk step by step through the entire lifecycle of data management, from raw data ingestion and cleaning to transformation into a processed, standardized dataset that serves as the foundation for consistent and accurate reporting across the organization.

We will also cover the general motivations behind the automation process and the critical data decisions made during cleaning. Moving from the general to the specific, we will show how we set up a data workflow that runs automatic reporting, in the form of both a dashboard and regular emails, using the City’s 311 data. In addition, we will talk about cloud data storage, cloud computing, and the other modern digital tools we used.

OpenStreetMap is the world’s largest volunteer-driven spatial dataset, relied on by millions of people around the world. But it’s more than one big map! Need to keep track of every playground, defibrillator, storm drain, or LGBT-friendly bar in your neighborhood? OpenStreetMap lets you leverage the power of crowdsourcing to fill data gaps left by commercial and government datasets. You can manage your data all in one place for the benefit of the entire OSM community.

Join OSM expert Quincy Morgan and civic innovation specialist Jazzy Smith to learn how organizations can use OSM as a free, collaborative GIS platform to meet their geodata needs. We’ll cover OSM basics, introduce available resources and workflows, review case studies of groups leaning into OSM, and look at how the Mapping for Equity project is connecting New Yorkers to their data.

This talk is for those of all skill levels interested in open map data. Come with questions! We’ll leave plenty of time for discussion.

Too often, the data that define our arts and culture sector fail to reflect its full diversity, leaving smaller and BIPOC-led organizations struggling to fit into rigid frameworks that overlook the depth and nuance of their impact. Open data—the practice of making datasets publicly accessible to increase transparency, accessibility, and innovation—has the potential to create a more equitable and informed arts ecosystem. However, without critical oversight, it can just as easily reinforce existing inequities rather than dismantle them.
If you’re an artist, cultural worker, organizer, advocate, funder, or policymaker concerned about how data shapes (or distorts) the narrative of our sector, join us for a candid panel discussion on the state of open data in NYC’s arts and culture landscape. We’ll unpack the realities of data collection and lay the groundwork for a collaborative effort to develop an Open Data Ecosystem that truly reflects the power and diversity of our cultural communities.
This event will feature a presentation of a recent study by the Culture & Arts Policy Institute, exploring the challenges and opportunities of leveraging open data to strengthen the cultural sector, enhance data literacy, and promote best practices across the city.

RSVP: https://www.eventbrite.com/e/data-power-and-justice-the-state-of-open-data-for-culture-arts-in-nyc-tickets-1277978611429

Join this event to learn from NYC Emergency Management (NYCEM) about their role coordinating citywide emergency planning and response for all types and scales of emergencies, and how they use 311 data in both the response and recovery cycles of a disaster.
This virtual presentation will explore:
– The types of reporting products produced at NYCEM for different disaster cycles.
– A technical overview of our data collection process and data pipeline.
– The important role 311 data plays in different cycles of an emergency.

NYC School of Data is a community conference that demystifies the policies and practices around open data, technology, and service design. This year’s conference helps conclude NYC Open Data Week and features 30+ sessions organized by NYC’s civic technology, data, and design community! Our conversations and workshops will feed your mind and inspire you to improve your neighborhood.

To attend, you need to purchase tickets. The venue is accessible, and the content is all-ages friendly! If you have accessibility questions or needs, please email us at schoolofdata@beta.nyc.

Thank you to Reinvent Albany and Esri for helping to cover conference costs and making it possible to meet in 2025.

And if you can’t join us in person, tune in to the main stage live stream provided by the Internet Society New York Chapter. Follow the conversation at #nycsodata on Bluesky.

Purchase your tickets here.

The WeGovNYC Databook (https://databook.wegov.nyc/) is a data pipeline that indexes, normalizes, and republishes over 50 NYC Open Data datasets into a single interface that offers in-depth profiles of City agencies, public schools, civil service titles and more.

During this session, Devin Balkind of WeGovNYC will review how the Databook’s data pipeline works, give a tour of the interface, talk about some recent FOILing, share plans for integrating MTA data, and discuss their next-generation open data stack that will make it much easier for people to build data products using transformed data.