Vertica is a very powerful analytics database, and security is important! You might need to store sensitive data, like Social Security numbers (SSNs), but you don't want the SSNs visible to anyone who can read the table in the database. To provide an additional layer of security, we can encrypt the SSN itself …
Recent Posts
-
Analyzing Strava metadata
I love running, and I love stroller running with my son even more. Strava is my go-to fitness app and I've tagged all of my stroller runs with a searchable tag so I can count the miles we've logged together, mostly while he has slept!
The search functionality on Strava's …
-
Presque Isle Marathon
PR Baby! Boston bound for 2020.
-
Exploring the science behind the Yasso 800
If you haven't heard of them, Yasso 800s are an infamous running workout touted as a predictor of your marathon time. The name was coined by Amby Burfoot, paying homage to Runner's World editor Bart Yasso.
While even Yasso himself has professed he had no idea why the math worked …
-
Developing Python on Vertica
Vertica is a very powerful analytics database, and we can now easily extend its functionality by building in Python functions. Here I'll focus on setting up a development environment for building a simple UDx.
The documentation from Vertica is not super specific, so this may …
-
Boston Marathon 2019
What a day! Congrats to all the finishers. I count myself very grateful to be at the start line of this one, with my wife at 39 weeks pregnant. Didn't have the run that I had hoped out there, wrote some checks that my legs couldn't cash with a first …
-
Scoring arbitrarily large datasets with Pandas + Sklearn
-
Writing LaTeX in Atom
Atom is a code editor. The defaults try to "complete" words from your writing, and don't highlight spelling. After many months of using a code editor to write, it's clear that I've gone backwards and I should at least be using spell checking!
These two settings vastly improve the LaTeX …
-
Tricks for coercing Pandas into parquet
For coercing pandas datetimes (stored as NumPy datetime64[ns]):
    for col in df.columns[df.dtypes == np.dtype('<M8[ns]')]:
        # https://stackoverflow.com/questions/32827169/python-reduce-precision-pandas-timestamp-dataframe
        # apply(lambda x: x.replace(microsecond=0))
        df[col] = df[col].values.astype('datetime64[s]')
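Here is a small self-contained sketch of the same trick, in case you want to try it on a toy frame first. The helper name `truncate_to_seconds`, the column names, and the sample values are all made up for illustration; the only idea taken from the snippet above is casting the underlying NumPy values to `datetime64[s]`. It assumes timezone-naive datetime columns.

```python
import numpy as np
import pandas as pd


def truncate_to_seconds(df):
    """Return a copy of df with sub-second precision dropped from datetime columns.

    Hypothetical helper for illustration; only handles timezone-naive columns.
    """
    out = df.copy()
    for col in out.columns:
        # True for datetime64 columns regardless of their stored resolution
        if pd.api.types.is_datetime64_dtype(out[col]):
            # Casting the NumPy values to second resolution discards the fraction
            out[col] = out[col].values.astype("datetime64[s]")
    return out


# Toy data (made up): one datetime column with microseconds, one float column
df = pd.DataFrame({
    "ts": pd.to_datetime([
        "2019-04-15 10:02:03.123456",
        "2019-09-08 07:30:00.999999",
    ]),
    "miles": [26.2, 26.2],
})

truncated = truncate_to_seconds(df)
print(truncated)
```

Working on a copy keeps the original frame untouched, which makes it easy to compare before and after. Note that the cast drops the fractional part rather than rounding it.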
For coercing Python datetime (here, a datetime.date, there …
-
Some notes about running D3 inside Jupyter