Data Generation Pipeline
Overview
The Data Generation Pipeline is a build-time process that transforms raw CSV datasets into a structured, type-safe TypeScript format. This centralized data source powers the search, filtering, and analytics dashboard across the entire application.
The pipeline handles:
- Parsing weekly CSV files.
- Converting currencies (USD to INR).
- Normalizing sector and stage categories.
- Generating unique identifiers for deep-linking.
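As an illustration of the normalization step, the sketch below maps free-form stage labels from a CSV onto canonical names. The alias table and the `normalizeStage` name are hypothetical; the pipeline's actual category list is not shown in this document.

```typescript
// Hypothetical alias table (assumption): keys are lower-cased raw labels,
// values are the canonical stage names used in the generated data.
const STAGE_ALIASES = new Map<string, string>([
  ["seed", "Seed"],
  ["pre-seed", "Pre-Seed"],
  ["series a", "Series A"],
  ["series-a", "Series A"],
]);

// Trim, collapse repeated whitespace, then look up the canonical label.
// Labels with no alias are returned cleaned but otherwise unchanged.
function normalizeStage(raw: string): string {
  const cleaned = raw.trim().replace(/\s+/g, " ");
  return STAGE_ALIASES.get(cleaned.toLowerCase()) ?? cleaned;
}
```

Returning unknown labels unchanged, rather than throwing, keeps a new or misspelled stage visible in the output so it can be added to the table later.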
Directory Structure
To be processed, raw data must be organized into weekly directories under data/raw/, one directory per week (for example, 2025-W47). The generator scans these directories to compile the master list.
data/
├── raw/
│   ├── 2025-W47/
│   │   └── funding_data.csv
│   └── 2025-W48/
│       └── funding_data.csv
└── funding-data.ts    <-- Generated Output
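The directory scan can be sketched as a filter over the entry names found in data/raw/. The `YYYY-Www` naming rule and the `selectWeekDirectories` helper are assumptions inferred from the example tree, not the generator's confirmed logic.

```typescript
// Assumed naming rule: four-digit year, "-W", two-digit week (e.g. "2025-W47").
const WEEK_DIR_PATTERN = /^\d{4}-W\d{2}$/;

// Keep only entries that look like weekly directories and return them in
// ascending order. Because the names are fixed-width, a plain string sort
// is also chronological.
function selectWeekDirectories(entries: string[]): string[] {
  return entries.filter((entry) => WEEK_DIR_PATTERN.test(entry)).sort();
}

// Example: stray files in data/raw/ are ignored.
const weeks = selectWeekDirectories(["2025-W48", ".DS_Store", "2025-W47", "notes"]);
console.log(weeks); // [ '2025-W47', '2025-W48' ]
```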
Configuration
Before generating data, you must configure the global currency conversion rate. This ensures consistency across the analytics and detail pages.
Currency Conversion
Edit config/currency.js to set the current exchange rate:
// config/currency.js
module.exports = {
  rate: 83.50,        // 1 USD = ₹83.50
  date: "2025-11-30"
};
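To show how this rate relates to the stored amounts, here is a minimal sketch of a USD-to-Lakhs conversion. It assumes the CSV amounts are in whole US dollars and that no rounding is applied; `usdToLakhs` and `CurrencyConfig` are illustrative names, not part of the pipeline's documented API.

```typescript
// Shape of the object exported by config/currency.js.
interface CurrencyConfig {
  rate: number; // rupees per 1 USD
  date: string; // date the rate was recorded
}

const currencyConfig: CurrencyConfig = { rate: 83.5, date: "2025-11-30" };

// 1 Lakh = 100,000 rupees.
const RUPEES_PER_LAKH = 100000;

// Convert a USD amount to Lakhs of rupees using the configured rate.
function usdToLakhs(amountUsd: number, config: CurrencyConfig): number {
  return (amountUsd * config.rate) / RUPEES_PER_LAKH;
}

// A $1,000,000 round at ₹83.50 per dollar is ₹8.35 Crore, i.e. 835 Lakhs.
console.log(usdToLakhs(1000000, currencyConfig)); // 835
```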
Execution
To update the application's data source, run the generation script from the root directory:
npm run generate-data
This script reads every CSV under data/raw/, applies the conversion logic, and overwrites data/funding-data.ts. Because the output file is regenerated on each run, manual edits to it are lost.
Data Schema
The pipeline produces an array of Deal objects. When consuming this data in components, use the following interface:
interface Deal {
  id: string;            // Unique hash generated from company and date
  company: string;       // Name of the startup
  amount: number;        // Funding amount in Lakhs (e.g., 100 = 1 Crore)
  stage: string;         // Funding round (e.g., "Series A", "Seed")
  sectors: string[];     // Array of industry sectors
  investors: string[];   // List of all participating investors
  leadInvestor: string;  // Primary investor
  date: string;          // ISO Date string (YYYY-MM-DD)
  location: string;      // Headquarters city
  description: string;   // Brief overview of the company
  sourceUrl?: string;    // Link to news or official announcement
}
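The `id` field is described as a hash of company and date. The hashing scheme itself is not documented here, so the following is only one possible sketch: an FNV-1a style 32-bit hash over the UTF-16 code units of a normalized key. The `makeDealId` name and the key format are assumptions.

```typescript
// Build a deterministic 8-character hex id from company name and ISO date.
// Assumption: company names are compared case-insensitively and trimmed.
function makeDealId(company: string, date: string): string {
  const key = `${company.trim().toLowerCase()}|${date}`;
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime in 32 bits
  }
  // Convert to an unsigned value and left-pad to a fixed width.
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}
```

Because the id depends only on company and date, regenerating the data yields the same id for the same deal, which is what keeps deep links stable. A 32-bit hash can collide on large datasets; a longer digest would be a safer choice if that matters.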
Consumption and Formatting
The system provides utility functions in @/lib/utils to handle the display of generated data.
Formatting Funding Amounts
The formatFundingAmount utility converts the numeric amount (stored in Lakhs) into a human-readable Indian Rupee (INR) string using Lakhs (L), Crores (Cr), or Billions (B), depending on magnitude. An amount of 0 is rendered as "Not Disclosed".
import { formatFundingAmount } from "@/lib/utils";
// Example Usage
formatFundingAmount(100); // Returns "₹1Cr"
formatFundingAmount(50); // Returns "₹50L"
formatFundingAmount(100000); // Returns "₹10.00B"
formatFundingAmount(0); // Returns "Not Disclosed"
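For reference, the following sketch reproduces the four outputs listed above. It is not the actual implementation in @/lib/utils: the thresholds (100 Lakhs for Crores, 10,000 Lakhs for Billions) and the decimal handling are assumptions chosen to match those examples, and the function is named `formatFundingAmountSketch` to keep it distinct from the real utility.

```typescript
const LAKHS_PER_CRORE = 100;      // 1 Crore = 100 Lakhs
const LAKHS_PER_BILLION = 10000;  // 1 Billion rupees = 10,000 Lakhs

function formatFundingAmountSketch(lakhs: number): string {
  if (lakhs <= 0) return "Not Disclosed"; // 0 marks an undisclosed deal
  if (lakhs >= LAKHS_PER_BILLION) {
    // Billions are shown with two decimals, e.g. "₹10.00B".
    return `₹${(lakhs / LAKHS_PER_BILLION).toFixed(2)}B`;
  }
  if (lakhs >= LAKHS_PER_CRORE) {
    // Crores are shown without trailing zeros, e.g. "₹1Cr".
    return `₹${lakhs / LAKHS_PER_CRORE}Cr`;
  }
  return `₹${lakhs}L`;
}
```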
Handling Undisclosed Deals
In the pipeline, deals with unknown amounts are stored with a value of 0. Use isFundingDisclosed to filter these out during financial calculations:
import { isFundingDisclosed } from "@/lib/utils";
const disclosedDeals = fundingData.filter(deal => isFundingDisclosed(deal.amount));
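A short sketch of why this filter matters for aggregates: undisclosed deals left in the list would drag an average toward zero. The `isDisclosed` helper below stands in for isFundingDisclosed and assumes it simply checks for a positive amount; the sample deals are made up for illustration.

```typescript
interface DealAmount {
  company: string;
  amount: number; // in Lakhs; 0 means undisclosed
}

// Assumed behaviour of isFundingDisclosed: any positive amount is disclosed.
function isDisclosed(amount: number): boolean {
  return amount > 0;
}

// Hypothetical sample data.
const sampleDeals: DealAmount[] = [
  { company: "Alpha", amount: 100 },
  { company: "Beta", amount: 0 },
  { company: "Gamma", amount: 50 },
];

const disclosedSample = sampleDeals.filter((deal) => isDisclosed(deal.amount));
const totalLakhs = disclosedSample.reduce((sum, deal) => sum + deal.amount, 0);
const averageLakhs = disclosedSample.length > 0 ? totalLakhs / disclosedSample.length : 0;

console.log(totalLakhs);   // 150
console.log(averageLakhs); // 75 (it would be 50 if the undisclosed deal were included)
```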