Modern Data Stack Accelerator

Book an office hour appointment to discuss if the Modern Data Stack Accelerator or other CalData service offerings are right for you.

Overview

The Modern Data Stack Accelerator (MDSA) is a practical approach to help departments rapidly adopt modern cloud-based data tools while working on a data challenge they want to solve.
Move from siloed data to integrated data in order to build trust in your organization's data
Your department may struggle to combine data across systems, or may still rely on painful manual processes to do so. This manual effort costs time and diverts staff from higher-value work. Or your department may be struggling with how best to acquire and use modern data tools.
The Modern Data Stack Accelerator will help empower your team to use modern data tools effectively. Our goal is to demystify what building a modern data stack means, including:
  1. architecting and procuring data stack components,
  2. creating a culture of data operations across a team,
  3. developing repeatable, automated, and observable processes, and
  4. staffing to maintain a modern data stack.
This service relies on advances in cloud data warehouses (e.g., Snowflake, Google BigQuery, Microsoft Azure, Redshift) and complementary data services that make it easier, faster, and cheaper to provide data that is highly available, robust, and agile.
This accelerator starts with a real business problem where supporting data has traditionally been difficult to combine and/or clean up. Using modern data tools in our environment, combined with training and consultation, your team will learn how to transform and automate this process.
The goal of this service is to build your team’s capacity to use the modern data stack and be ready to move from demonstration to production. This is a new service and our team is looking for partners willing to experiment and learn with us.
Watch a brief 9-minute presentation on the Modern Data Stack Accelerator to hear more about this service.
Visit our showcase to see an example of a modern data stack deployment.

How this service can help your department

This service can help your department tackle difficulties with:
  • Combining multiple datasets efficiently and automatically for analysis and reporting
  • Automating manual data quality efforts for high visibility analyses or reporting
  • Moving, querying, and analyzing large datasets
  • Assessing or trying new data tools to see if they work for you before making an investment
Four panels with icons representing the preceding list of difficulties the Modern Data Stack Accelerator can help with. There is one panel for each difficulty on a dark blue background with the same text as in the above list items.

Who is a good fit for this service

This service requires participation from your data team, IT, and a program area. This is a holistic service that works across your department teams to empower modern data use.
This service is a good fit for organizations that have a real data question to address and:
  1. have been struggling with developing trustworthy sources of data and are ready to try something new,
  2. are open to consciously experimenting and learning together with CalData staff, and
  3. are interested in or have already started investing in cloud data warehouses (like Snowflake, Google BigQuery, Microsoft Azure or Redshift to name a few).
Three panels representing the 3 listed items above. From left to right, first with an icon of a person pushing a boulder up a bar chart, the second a beaker and test tube, and third a database with a cloud behind it.
This service would NOT be a good fit for:
  • Programs looking to set up technology without their organization’s IT support. The Office of Data and Innovation (ODI) is helping you to identify solutions that can be embraced by the organization. Our service is to accelerate learning and adoption, not to provide ongoing support.
  • Organizations that want to outsource their data work. Adopting the modern data stack is more than just tools. While you can use systems integrators and contractors, the focus of this service is to empower state staff with improved access to data and modern tools to work with that data.
  • Organizations that don’t have authority to access or use the data in question. If you depend on data from another organization, that’s okay. You just need to be certain that it can be shared as part of the project. See the Interagency Data Exchange Agreement (IDEA) for more on data sharing agreements.
If you’re unsure about your fit for this accelerator, book office hours or send us your question. We’re happy to advise!

What this service includes

A process flow diagram ordered from left to right. Starting with discovery sprint, then onboarding and training, then development sprints, then project retro, and finally handoff. These are then described below in text.
The Modern Data Stack Accelerator follows a structured process that allows for iteration and learning.
The engagement (4-6 months) has the following structure:
Discovery Sprint
A 2-week period beginning with project kickoff and ending with a signed project charter. Discovery may include interviews, data sample reviews, and workshops to refine project scope and analysis questions.
Onboard and Train
Staff will receive baseline training in core concepts of the modern data stack. The training is designed to establish a shared baseline of knowledge across the partner team.
Development Sprints
A series of 2-week development periods. The specific number of sprints will be determined during scoping and planning.
Project Retro
A look back at the project to identify what worked and what needs improvement overall.
Handoff
Based on a plan developed with the department partner, ODI works with you on a handoff to move from demonstration to production in your environment.

Expectations of department partners

The MDSA service is designed to take an existing data challenge and build your team’s capacity to solve it by adopting modern data tools and processes. Our team will help you, but we will not build or maintain your data processes for you.
Below are the expectations departments will need to meet to use this service:
  • A team assembled by the department including the following staff roles (note: staff can fulfill multiple roles if appropriate):
    • Program, IT, and data leadership. The leads will champion the project and stay aware of the project as it progresses. These people should be involved before the application is submitted.
    • Project champion. An individual who can help keep the project moving. They can be trusted to raise issues as needed and work with the CalData team to address any project blockers. They will provide continuity and connection across teams and individuals during the accelerator.
    • Data stewards. Program and data staff that understand how the relevant data is collected, what it means, and its importance to the program.
    • Data analysts. At least 2 staff to be trained in SQL-driven analytical work to take raw source data and transform it into usable datasets that support the program. Program and data staff need to be able to write and review code to enable a real culture of data operations.
    • Data custodians. The technical staff that secure and maintain the systems that contain source data relevant to the project.
    • Data engineers or equivalent. Staff who are proficient in SQL and literate in cloud tooling and a scripting language like Python. These staff will be trained in writing and maintaining data pipelines.
If you’re unsure whether you have coverage of these roles or need any clarification, book an office hour or send us your question. We’re happy to advise and suggest alternatives that may work for all of us.
  • Leadership participation. The three lead roles identified above (IT, data, and program) should be active participants in the process. They will help shape the engagement at an initial meeting and must participate in regular progress meetings to ensure alignment.
  • Data analyst and engineering staff with relevant skills. Analyst and engineering staff should be comfortable querying databases with basic SQL select statements. Engineering staff with Python skills may also be needed depending on the project scope.
  • Dedicated staff time. The program consists of a mix of training and project work. This will involve a time investment spread out over 4-6 months. Not all staff will be needed at all times across the accelerator timeline.
  • Data. Departments will need to identify data with business/program impact and will need to have the ability and authority to share the data with CalData staff.
  • IT partnership (or participation). This program requires a partnership with the department’s IT team and, ideally, the IT team’s active participation.
  • Presenting and disseminating. Part of our goal is to document and communicate what we learn and accomplish so other departments and jurisdictions can benefit. We will need input, feedback, and some participation from departments in this process.