The first version of an in-browser schema generator for ProxyML is now available! The page is live, but you can also download the HTML and run it locally if you’d prefer.
ProxyML trains your surrogates on synthetic data generated from descriptive statistics about your data (min, max, etc.). You share these stats with ProxyML in a schema, so your data never leave your server.
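To make the idea of sharing descriptive statistics concrete, here is a minimal sketch in plain Python. The field names (`type`, `min`, `max`, `categories`), the helper functions, and the sample rows are all made up for illustration; this is not ProxyML's actual schema format or the SDK's implementation. It only shows the general shape of the approach: summarize each column, and keep the rows themselves on your side.

```python
import json


def summarize_column(values):
    """Summarize one column; the result holds summaries, not individual rows."""
    present = [v for v in values if v is not None]
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        return {"type": "numeric", "min": min(present), "max": max(present)}
    # Anything non-numeric is treated as categorical in this sketch.
    return {"type": "categorical", "categories": sorted({str(v) for v in present})}


def build_schema(rows):
    """Build a hypothetical schema: one summary per column name."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {name: summarize_column(vals) for name, vals in columns.items()}


# Made-up rows, loosely Titanic-shaped; not taken from the real dataset.
rows = [
    {"age": 22, "fare": 7.25, "sex": "male"},
    {"age": 38, "fare": 71.28, "sex": "female"},
    {"age": None, "fare": 8.05, "sex": "male"},
]

schema = build_schema(rows)
print(json.dumps(schema, indent=2))
```

Only the `schema` dictionary would ever need to leave your machine in a setup like this; the `rows` list stays where it is.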
The Python SDK includes schema generation functionality, but if you’d prefer to use curl or to code in another language, writing a schema by hand can be a little tedious. The schema generator gives you the same head start you’d get from the Python SDK, and from there you can make minor adjustments to the schema to fit your use case.
Here’s a quick demo of the page being used to create a schema for the basic features in the Titanic dataset.