NYC School of Data is a community conference that demystifies the policies and practices around open data, technology, and service design. This year’s conference helps conclude NYC Open Data Week and features 30+ sessions organized by NYC’s civic technology, data, and design community! Our conversations and workshops will feed your mind and inspire you to improve your neighborhood.

To attend, you need to purchase tickets. The venue is accessible, and the content is all-ages friendly! If you have accessibility questions or needs, please email us at schoolofdata@beta.nyc.

Thank you to Reinvent Albany and Esri for helping to cover conference costs and making it possible to meet in 2025.

If you can’t join us in person, tune in to the main stage live stream provided by the Internet Society New York Chapter. Follow the conversation at #nycsodata on Bluesky.

Purchase your tickets here.

A school “colocation” occurs when two or more public schools share the same school building or campus. Colocations have long been a part of the city’s public schools, but they increased rapidly under the Bloomberg and subsequent administrations, mirroring the rise of charter schools and the small schools movement.

In this session, we present a look at the state of colocations in our schools through the lens of open data. We look at colocations from the perspective of changing neighborhood demographics represented in US Census data, as well as school demographic and academic data gathered from NYC Open Data datasets. We consider how and when schools thrive as colocations and when they suffer or present inequities.

During our presentation, we will discuss the background, driving questions, and findings of our research, and we will demonstrate the methods, approach, and code we used to work with open geospatial data.

We follow our presentation with a workshop demonstrating new ways to plot overlapping spaces on data-driven maps using the Python programming language. Our team will work with participants to code their own maps that investigate various aspects of school buildings and colocations. Participants of all levels are welcome.

You are invited to celebrate and document the vibrant urban ecologies of New York City in an engaging edit-a-thon event. This special day is dedicated to contributing to open data projects and expanding the digital landscape of urban agriculture through collaboration with experienced WikiNYC editors and open data enthusiasts. Whether you are a seasoned Wikipedia editor or new to the process, your knowledge and stories are welcome. Join us to help preserve and amplify the impact of these cherished green spaces!

Hosted By: Farming Concrete, Wikimedia NYC, Prime Produce, Seeds to Soil, Cafe 242 Hell’s Kitchen

Introduction to Wikipedia and Open Data Editing: Learn how to edit Wikipedia and contribute to open data platforms with guidance from experienced editors and open data specialists. Projects will include:
– OpenStreetMap + Farming Concrete: Mapping community gardens and urban green spaces.
– Wikidata and Wikibase: Adding and enhancing data about community gardens, land ownership, and urban environments.
– Wikispore: Exploring creative documentation and storytelling.
– Open Source Flexibility: If you are interested in editing other open-source projects or platforms not explicitly listed, there will be opportunities to explore and work on them with available support.

Documentation and Story Sharing: Share your firsthand experiences and narratives about community gardens and farms, linking them to credible primary and secondary sources.
Open Data Exploration: Collaborate on projects related to land ownership, urban heat data, and civic ecology.
Contribute Your Visuals: Bring photographs, videos, and other media that capture the essence and beauty of community gardens to enrich the visual narrative.

Together, we’ll document, map, and celebrate community gardens—putting them on the global digital map!

What to Bring:
– Articles of Interest: Publications or written materials relevant to urban agriculture and community gardens.
– Lists and Ideas: A list of community gardens or related projects you’re interested in documenting.
– Photographs and Media: Any media or documentation to help visualize these spaces.
– Reference Sources: Bring any citations or resources to enhance the content.
– Devices: Laptops, tablets, or other devices for hands-on editing.
– Excitement: Bring your passion for learning, sharing, and building community!

Contact and Questions:
If you have any questions or ideas to share, reach out to kellystgreen@gmail.com.

Join us as we share lessons learned from applying GenAI and Natural Language Processing (NLP) to alternative data sources! We’ll walk through a project where we used Public Pulse Mining to evaluate how the public engages with the General Services Administration’s construction projects and to better understand local stakeholder priorities and perceptions.

Then, we’ll dive into an interactive prompt engineering exercise using our master prompt templates for structuring unstructured data. You’ll gain practical takeaways on using AI for public engagement, including how to extract insights from free-text datasets like NYC public meeting YouTube transcripts, 311 feedback, and consumer complaints.
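
To give a flavor of what a prompt template for structuring free text can look like, here is a minimal Python sketch. It is not the presenters’ master template: the wording of the prompt, the JSON keys, and the helper names `build_prompt` and `parse_reply` are all assumptions made for illustration, and the call to an actual language model is left out.

```python
import json
from string import Template

# Hypothetical prompt template; the wording and the JSON keys are
# illustrative assumptions, not the presenters' actual template.
PROMPT = Template(
    "You are analyzing public feedback.\n"
    "Read the comment below and reply with JSON only, using the keys\n"
    '"topic", "sentiment" (positive, neutral, or negative), and "summary".\n\n'
    "Comment: $comment"
)

REQUIRED_KEYS = {"topic", "sentiment", "summary"}
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


def build_prompt(comment: str) -> str:
    """Fill the template with one free-text comment."""
    return PROMPT.substitute(comment=comment.strip())


def parse_reply(reply: str) -> dict:
    """Check that a model reply is a JSON object with the expected fields."""
    record = json.loads(reply)
    if not isinstance(record, dict):
        raise ValueError("reply is not a JSON object")
    missing = REQUIRED_KEYS - set(record)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if record["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {record['sentiment']!r}")
    return record
```

Validating every reply before storing it is one simple way to keep a structured dataset consistent when the source text is messy.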

This session is open to all audiences, regardless of technical background. We’ll also share open-source tools and scripts on GitHub so you can apply these methods to your own datasets!

Urban street flooding presents significant challenges for metropolitan areas like New York City, particularly in the face of intense rain events. In this workshop, we will explore the causes and variability of street flooding using NYC Open Data and machine learning techniques. Building on prior research, we aim to reproduce findings on flood risk factors while incorporating updated NYC 311 service request data and other sources such as the U.S. Census. This approach will enhance our understanding of how socio-economic and infrastructural factors contribute to flooding, offering new insights into the spatial dynamics of flood risk.

The workshop will focus on three key topics: data cleaning to process NYC 311 flood reports and supplementary datasets, exploratory data analysis to identify patterns in flood risk factors, and predictive modeling using Random Forest regression. By analyzing how factors like land features, topography, and population dynamics influence flood risk, participants will gain hands-on experience with urban flood modeling techniques.
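
As a rough idea of the data cleaning step only, the Python sketch below counts street flooding reports per ZIP code and drops rows without a usable ZIP. The sample rows are made up, and the column names (`created_date`, `descriptor`, `incident_zip`) and the descriptor label are assumptions about how a 311 export might look; the workshop’s own pipeline and its Random Forest model may differ.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample standing in for a 311 export. Column names and
# values are assumptions for illustration, not real records.
SAMPLE = """created_date,descriptor,incident_zip
2024-09-29T08:15:00,Street Flooding (SJ),11215
2024-09-29T08:40:00,Street Flooding (SJ),11215
2024-09-29T09:05:00,Catch Basin Clogged/Flooding (SC),10001
2024-09-29T09:30:00,Street Flooding (SJ),
2024-09-30T10:00:00,street flooding (sj),11385
"""


def flood_reports_by_zip(csv_text, descriptor="Street Flooding (SJ)"):
    """Count reports matching `descriptor` per ZIP code.

    Matching ignores case and surrounding whitespace; rows whose ZIP
    is not exactly five digits are skipped.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["descriptor"].strip().lower() != descriptor.lower():
            continue
        zip_code = row["incident_zip"].strip()
        if len(zip_code) == 5 and zip_code.isdigit():
            counts[zip_code] += 1
    return counts


counts = flood_reports_by_zip(SAMPLE)
```

Per-ZIP counts like these could then be joined with Census or land feature data before any exploratory analysis or modeling.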

Four students from the Spring 2025 Introduction to Data Science course at UConn will present their projects in sequential order, each focusing on one aspect of the core topics. The presentations will be followed by a Q&A session, providing participants with an opportunity to engage with the presenters and explore the findings in greater depth.

  • How confidently can we predict the impacts of zoning change on housing supply?
  • Can we use AI to create novel datasets that may allow us to better understand housing phenomena?
  • What would it take to model a reality in which we build 1 million housing units?

These were some of the questions that led Janita Chalam, an independent researcher with a background in software engineering and machine learning, to begin their research journey into discovering how open data, statistical modeling, and AI can help us tackle the housing affordability crisis.

This presentation will walk through what Janita has learned about the variables at play in NYC’s housing landscape and present a statistical analysis of the Bloomberg-era upzonings as a case study in examining the frictions to building more housing in NYC.

Finally, Janita will propose some ideas for what kind of data and methodologies we might need in order to make bolder claims about what it takes to get us out of the housing crisis. By the end of this talk, we will hopefully have a better understanding of the role that data and empiricism can and should play in our conversations about housing policy.

This talk is for anyone interested in housing affordability and will not require any expertise in the technologies mentioned.

New York City agencies create and publish a huge volume of geospatial data each year. They use Geographic Information Systems (GIS) – computer-based tools to store, visualize, and analyze this geographic data. This panel will review publicly available tools and datasets, discuss the state of GIS technology in the city, and consider how the City uses geospatial data to serve NYC residents.
Join this conversation with agency GIS leaders about new maps & tools, geospatial data, and initiatives for 2025.

Moderator – Lee Ilan, NYC Mayor’s Office of Environmental Remediation
Panelists
Josh Friedman, NYC Emergency Management
Matt Croswell, NYC Department of City Planning

Join our virtual event to discover a cutting-edge tool that pinpoints prime tree planting sites across NYC using NYC Open Data. This model empowers NYC Parks foresters to expand the City’s green canopy more efficiently. The model leverages datasets such as NYC Parks Tree & Site, NYC Planimetrics, DOT Traffic Signs, and the DOH Heat Vulnerability Index to drive site selection. Shawn Ganz, Data & Product Designer with NYC Parks, will break down the model’s purpose, methodology, and data foundations while advocating for greater open data access in urban planning. Wrap up the session with an engaging Q&A and discussion on future applications.

Have you ever wondered how parks in your neighborhood compare with others? Meet the Vital Parks Explorer! In this session, the Innovation & Performance Management (IPM) team from NYC Parks will share some highlights of how this public-facing tool was built. The Explorer visualizes access to park amenities and services across the city. From inception and prototype to public release, three data professionals working in local government will give you a behind-the-scenes look at how data, analysis, visualization, and user experience considerations shaped the final product. This event is for all New Yorkers who care about parks and might be of particular interest to advocates of public spaces, civic data enthusiasts, web app developers, designers, geospatial data scientists, and engineers. We look forward to your participation and feedback!

Speaker bios:
Lilian Chin is a Data Analytics Specialist on Parks’ IPM team, where she has worked since September 2023. As part of IPM, she supports a wide range of data-driven initiatives for Parks’ Maintenance and Operations. This includes visualizing the Work Order backlog, streamlining data pipelines for park assets, developing methodologies for the Park Condition Score, building in-house dashboards, and improving data quality and documentation.

Kate Sales is a Data Analytics Specialist on Parks’ IPM team. In the last year, she has worked on projects that touch many aspects of Parks including collecting and combining volunteer data for the Let’s Green NYC initiative, creating dashboards for Vital Parks for All, and helping others learn how to visualize data. Before Parks, Kate was a GIS analyst at the urban planning consulting firm Urban3 in Asheville, NC, her hometown. She recently completed her Master of Urban Planning at CUNY Hunter College and earned her BA in geography at Macalester College.

Benno Mirabelli is a Data Scientist on Parks’ IPM team. He works on various data analysis, reporting, research, and optimization projects. Some examples of his ongoing work include the routing analysis based on LION data for the recently released Vital Parks Explorer, research on understanding usership patterns and visitor volume at parks, and new management tools that assist with grass maintenance, planning, seasonal worker assignments and more. He holds a PhD in Applied and Computational Mathematics.