PingThings is the data layer for physical infrastructure. Built on a decade of research at UC Berkeley. Deployed across utilities, federal labs, and industrial operators. Engineered for the data your sensors actually produce, not the schema your historians can tolerate.
Each capability gets a chapter: what it is, what physical systems require from it, and what the platform makes possible. Read in order, or jump to the one that matches your situation.
Your fleet runs sensors from a dozen vendors across decades of vintages, sampling at rates from quarterly meter reads to megahertz waveforms, communicated over protocols that didn't exist when your historian was specified. The platform was built for that reality.

PMUs, digital fault recorders, point-on-wave instruments, SCADA RTUs, AMI smart meters, power quality monitors, relay event loggers, BMS sensors, vibration sensors, thermal cameras, weather stations, satellite-derived environmental data. The platform ingests them at native sample rate, in their native protocol, without forcing translation at the edge.
No replatforming. No data cleansing as a precondition. No schema harmonization at ingest. The platform takes what your fleet actually produces and resolves the rest at query time.
Sensor diversity stops being an integration problem and starts being a feature. New sensor classes onboard in hours rather than quarters. Acquisitions integrate without ripping out existing data infrastructure. Vendor changes don't trigger pipeline rewrites.
If your sensor produces a measurement, the platform stores it. The schema is your problem to solve later, or never.
AMI samples at fifteen-minute intervals. SCADA polls at one to four seconds. PMUs sample at 30 to 120 hertz. Point-on-wave instruments sample at megahertz rates. The platform handles all of these in one substrate, at native rate, without forcing the lowest common denominator.

A fifteen-minute meter read is enough for a billing reconciliation. It tells you nothing about a 200-millisecond inverter oscillation. A 1-Hz SCADA poll captures slow trends. It misses sub-cycle vegetation arcing. A megahertz waveform is overkill for asset trending, yet it is the only resolution that captures incipient discharge.
Your data layer's job is not to harmonize at the lowest rate. It's to preserve every phenomenon at the rate its physics requires.
Sample rate stops being a storage decision. Engineers can investigate at the resolution the question requires. AMI, SCADA, PMU, and point-on-wave data coexist in the same query, time-aligned, queryable from years to microseconds.
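What "time-aligned at native rate" can mean in practice is easiest to see in a small sketch. The Python below is an illustration only, not the platform's API: the function name `asof_align` and the sample values are hypothetical. It attaches the most recent slow reading to each fast sample, so the fast stream is never downsampled to match the slow one.

```python
from bisect import bisect_right

def asof_align(fast, slow):
    """For each (t, v) in fast, attach the latest slow value with time <= t.

    Both inputs are lists of (timestamp_seconds, value) sorted by time.
    Neither stream is resampled; the fast stream keeps its native rate.
    """
    slow_times = [t for t, _ in slow]
    aligned = []
    for t, v in fast:
        i = bisect_right(slow_times, t) - 1  # last slow sample at or before t
        slow_v = slow[i][1] if i >= 0 else None
        aligned.append((t, v, slow_v))
    return aligned

# Hypothetical data: two 15-minute AMI reads and two seconds of 30 Hz PMU samples
# that straddle the 15-minute boundary at t = 900 s.
ami = [(0.0, 4.2), (900.0, 4.6)]
pmu = [(899.0 + k / 30.0, 59.98 + 0.001 * k) for k in range(60)]

rows = asof_align(pmu, ami)
```

Every PMU sample survives the join. Samples before the 900-second mark carry the first AMI read, and samples at or after it carry the second.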
The real world happens at its own speed. PingThings captures every detail.
Operational telemetry isn't one type of data. It's continuous waveforms and regular samples and irregular events and digital states and sparse measurements, often describing the same physical phenomenon from different vantage points. The platform stores and queries all five types together.

When a fault clears: the waveform spikes, the breaker opens, an alarm fires, the operator acknowledges, the dispatcher logs a note. Five different signal types, one physical event. Conventional architectures fragment them across waveform stores, time-series databases, event logs, state monitors, and ticketing systems. Engineers reconstruct events by querying five systems and reconciling timestamps by hand.
That fragmentation is a side effect of the storage layer, not a property of the data.
All five types share the same time-aware substrate. Cross-type queries are native. The waveform during a fault, the breaker state change, the alarm, the operator acknowledgment, and the manual reading two days later all show up in one query, in temporal order.
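A minimal sketch of that single time-ordered view, assuming each signal type is already available as a time-sorted stream. The `Record` type, the `timeline` function, and every value in it are hypothetical illustrations, not the platform's query interface.

```python
import heapq
from typing import NamedTuple

class Record(NamedTuple):
    t: float      # seconds since an arbitrary epoch
    kind: str     # which kind of signal produced the record
    detail: str

# Hypothetical per-type streams, each already sorted by time.
waveform = [Record(10.002, "waveform", "current spike, phase A")]
states = [Record(10.051, "state", "breaker OPEN")]
events = [Record(10.060, "event", "overcurrent alarm"),
          Record(42.500, "event", "operator acknowledged")]
sparse = [Record(172810.0, "manual", "field reading logged")]

def timeline(*streams, start=float("-inf"), end=float("inf")):
    """Merge sorted streams into one time-ordered list within [start, end]."""
    merged = heapq.merge(*streams, key=lambda r: r.t)
    return [r for r in merged if start <= r.t <= end]

# One query over the fault window, across every signal type.
fault_view = timeline(waveform, states, events, sparse, start=10.0, end=60.0)
for r in fault_view:
    print(f"{r.t:>10.3f}  {r.kind:<8}  {r.detail}")
```

Widening the window to two days brings the manual reading into the same ordered result, with no second system to query and no timestamps to reconcile by hand.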
The breaker opened, the waveform spiked, the alarm fired, and the operator clicked acknowledge. All of that is one event. The data layer should treat it that way.
The old path is familiar: vendor lock-in, forced migrations, brittle exports, and historical data loss. PingThings ends that cycle.

Open in that we use best-of-breed open source wherever it works: Grafana, Jupyter, Apache Arrow, Kubernetes, the Python data science stack. Proprietary only where we can do meaningfully better. Open in that the protocols are standards-based, the data formats documented, the APIs public. Open in that the storage engine is fast enough to actually move utility-scale data around when you need to. It's part of why PredictiveGrid gets called the missing historian for Databricks.
None of that is accidental. It's how the platform stays useful as your stack around it changes. You can run PredictiveGrid alongside an existing historian indefinitely. You can migrate at your own pace. You can take your data with you if you ever leave. None of that requires our permission.
Vendor consolidation stops being a strategic risk. Mergers and acquisitions stop being multi-year integration projects. Sensor vendor changes stop triggering historian replacements. The data substrate outlives the vendor relationships built on top of it.
An aging monopoly historian increased its price again. Don't trap yourself in the past.
Operational data exists in dozens of forms: live streams from substations, daily exports from billing systems, decades of archived COMTRADE files, weather datasets in NetCDF, satellite-derived environmental data in HDF5, and proprietary historian dumps no one knows how to read anymore. The platform takes all of it.

Most operators have twenty years or more of historical data. Most of it is unreachable. It's on tape, in proprietary historian dumps, in CSVs no one indexed, in formats whose specs were never written down. The decision to leave it inaccessible was made one project at a time, never deliberately.
That history is also where your AI training data lives, where your asset baseline lives, where the precedent for unusual events lives. It's worth recovering.
Bulk historical data ingests at production rates. New formats can be onboarded without modifying core platform code. The historical archive becomes queryable at the same latency as live telemetry. Twenty years of operational data stops being a sunk cost and starts being a corpus.
Twenty years of operational data sitting on tape doesn't have to stay there.
Sensors don't operate in isolation. They're attached to assets, located in geographies, tied to operating states, embedded in topologies. Without that context, raw measurements are just numbers. With it, they're investigations, predictions, and decisions.

A typical operator has dozens of system-of-record databases tracking which sensor is on which asset, which asset is in which substation, which substation is on which feeder, which feeder serves which customers. The catalogs are out of date by the time they're queried.
The platform makes the context part of the data, not a separate system that engineers reconcile by hand.
Spatial queries become trivial. Investigations span context layers automatically. Cross-asset analysis stops requiring three days of database joins. Engineers ask questions in operational language, not in database schema.
Ask "what was happening at the substation during the disturbance" and the answer should be one query, not three weeks of cross-referencing.
Storage is necessary. Storage is not enough. Most data layers stop at preservation: data lands, data sits, engineers export it elsewhere to analyze. The platform extends from ingest through visualization to operationalization, in one substrate, with no exports required.

Ingest. Preserve. Contextualize. Analyze. Visualize. Integrate. These are not separate products with their own data models. They are coherent layers of one platform, sharing one storage engine and one query path.
The distance between where the data lives and where the work happens is a gap most data architectures never close. The platform closes it by collapsing the layers into one system.
Engineers analyze in place, in their notebook, against the live archive, at the resolution the question requires. ML pipelines train on full-resolution data without exporting petabytes. Operators see real-time and historical data in the same interface, queryable at the same latency.
Close the gap between data and action.
The data your sensors actually produce is full of late arrivals, dropouts, jitter, dynamic sample rates, sparse intervals, and corrections that arrive weeks after the original data. Every conventional pipeline breaks under these conditions. PingThings was designed for them.

In production, telemetry arrives late. Communications drop. Sensors recalibrate. Backfills come in days after the original window. Engineers correct old data and want both versions preserved. Sample rates change because someone reconfigured the device. Data arrives out of order because two paths converged.
Conventional pipelines treat all of this as exception handling. The platform treats it as normal operation.
Engineering teams stop maintaining ETL pipelines as their primary job. Real-world data quality stops blocking analysis. Corrections, enrichments, and backfills happen without breaking downstream queries or losing history.
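To make "normal operation" concrete, here is a toy sketch of one way a store can accept late, out-of-order, and corrected points without losing history. It is an assumption-laden illustration, not PredictiveGrid's storage engine: the `VersionedStream` class and its methods are hypothetical.

```python
from bisect import insort

class VersionedStream:
    """Toy append-only stream: every write bumps a version; nothing is overwritten."""

    def __init__(self):
        self.version = 0
        self._writes = []  # (timestamp, version, value), sorted by timestamp then version

    def insert(self, timestamp, value):
        """Accept a point in any arrival order, including corrections to old timestamps."""
        self.version += 1
        insort(self._writes, (timestamp, self.version, value))
        return self.version

    def read(self, start, end, as_of=None):
        """Return the latest value per timestamp in [start, end), as seen at version as_of."""
        as_of = self.version if as_of is None else as_of
        latest = {}
        for t, ver, value in self._writes:
            if start <= t < end and ver <= as_of:
                latest[t] = value  # higher versions sort later, so they win
        return sorted(latest.items())

s = VersionedStream()
s.insert(3.0, 60.01)             # arrives first
s.insert(1.0, 59.99)             # late arrival, out of order
v_before = s.insert(2.0, 60.00)  # original value
s.insert(2.0, 59.97)             # correction to an existing timestamp
```

Reading with `s.read(0.0, 10.0)` returns the corrected series in time order, while `s.read(0.0, 10.0, as_of=v_before)` returns the series as it stood before the correction. Downstream queries keep working, and both versions remain available.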
Real telemetry isn't tidy. The platform doesn't pretend it is.
Every utility board wants AI. Most of their data is averages of averages, sampled at intervals coarser than the phenomena their models are trying to predict. PredictiveGrid was built for the resolution physical AI actually needs.

A model trained on 15-minute AMI averages cannot predict an inverter oscillation that develops in 200 milliseconds. A model fed downsampled SCADA cannot detect a vegetation contact arc that lasts less than one cycle. Physical phenomena have characteristic timescales. If your data layer averaged through them, no amount of model architecture or training compute will recover what was lost.
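The arithmetic behind that claim fits in a few lines. The sketch below uses made-up numbers: a steady signal at 100 units, a 120 Hz native sample rate, and a 10 Hz oscillation of 5 units lasting 200 milliseconds are all assumptions chosen for illustration. It compares the 15-minute average with what the native-rate series shows.

```python
import math

RATE_HZ = 120                        # assumed native sample rate
WINDOW_S = 15 * 60                   # one 15-minute averaging window
BASE = 100.0                         # assumed steady value, arbitrary units
AMP, OSC_HZ, OSC_S = 5.0, 10.0, 0.2  # assumed oscillation: 5 units, 10 Hz, 200 ms
OSC_START_S = 400.0                  # assumed onset within the window

def sample(t):
    """Steady signal with a 200 ms oscillation starting at OSC_START_S."""
    if OSC_START_S <= t < OSC_START_S + OSC_S:
        return BASE + AMP * math.sin(2 * math.pi * OSC_HZ * (t - OSC_START_S))
    return BASE

# Native-rate series for the whole window.
native = [sample(n / RATE_HZ) for n in range(RATE_HZ * WINDOW_S)]

window_mean = sum(native) / len(native)                # what an averaged archive keeps
peak_deviation = max(abs(x - BASE) for x in native)    # what the native series shows

print(f"15-minute average: {window_mean:.4f}")
print(f"peak deviation at native rate: {peak_deviation:.4f}")
```

The window average stays at the steady value to within rounding, while the native-rate series shows a swing of about 5 units. A model that only ever sees the average has no trace of the oscillation to learn from.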
Physical AI runs on signal, not summary. That's the prerequisite most data layers cannot meet. It's also the reason most utility AI initiatives stall in data preparation rather than model design: the training data the model needs is data that was never preserved.
PredictiveGrid preserves every cycle of every signal at native sample rate, time-aligned across heterogeneous sensor classes: PMU and AMI, SCADA and waveform, BMS and power quality, in the same query, at the resolution at which they were captured. For ML and AI workflows, that translates into capabilities most teams have never had access to.
The model you train is only as good as the signal you preserved.
PredictiveGrid meets teams where the pain is: unexplained events, brittle historians, inverter-heavy operations, messy telemetry, and data that should be useful but is not yet actionable.
Show us the sensors, sample rates, formats, history, security constraints, and workflows that are giving you trouble. A few minutes is enough to know whether PredictiveGrid is the right substrate for your project.