Recent Posts

  1. Column level encryption on a Vertica Database

    Wed 08 January 2020

    Vertica is a very powerful analytics database, and security is important! You might need to store sensitive data, such as Social Security numbers (SSNs), but you don't want the SSNs visible to anyone who can query the table. To provide an additional layer of security, we can encrypt the SSN itself …

  2. Analyzing Strava metadata

    Mon 30 December 2019

    I love running, and I love stroller running with my son even more. Strava is my go-to fitness app and I've tagged all of my stroller runs with a searchable tag so I can count the miles we've logged together, mostly while he has slept!

    The search functionality on Strava's …

  3. Exploring the science behind the Yasso 800

    Sat 17 August 2019

    If you haven't heard of them, Yasso 800s are an infamous running workout touted as a predictor of your marathon time. The name was coined by Amby Burfoot, paying homage to Runner's World editor Bart Yasso.

    While even Yasso himself has professed he had no idea why the math worked …

  4. Developing Python on Vertica

    Fri 05 July 2019

    Vertica is a very powerful analytics database, and we can now easily extend its functionality with Python functions. Here I'll focus on setting up a development environment for building a simple UDx.

    The documentation from Vertica is not super specific, so this may …

  5. Boston Marathon 2019

    Mon 15 April 2019

    What a day! Congrats to all the finishers. I count myself very grateful to have been at the start line of this one, with my wife 39 weeks pregnant. Didn't have the run that I had hoped for out there, wrote some checks that my legs couldn't cash with a first …

  6. Writing LaTeX in Atom

    Fri 20 July 2018

    Atom is a code editor. The defaults try to "complete" words from your writing, and don't highlight spelling. After many months of using a code editor to write, it's clear that I've gone backwards and I should at least be using spell checking!

    These two settings vastly improve the LaTeX …

  7. Tricks for coercing Pandas into parquet

    Tue 29 May 2018

    To coerce pandas datetimes (stored as NumPy datetime64):

    import numpy as np

    # Truncate every datetime64[ns] column to whole-second precision.
    # See https://stackoverflow.com/questions/32827169/python-reduce-precision-pandas-timestamp-dataframe
    # Alternative: df[col].apply(lambda x: x.replace(microsecond=0))
    for col in df.columns[df.dtypes == np.dtype('<M8[ns]')]:
        df[col] = df[col].values.astype('datetime64[s]')
    

    For coercing python datetime (here, a datetime.date, there …
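
For the pandas datetime case in the last excerpt, here is a minimal, self-contained sketch of the same idea: dropping sub-second precision from every datetime column before writing. It swaps in `Series.dt.floor("s")` rather than the `astype('datetime64[s]')` cast shown in the excerpt, and the helper name `truncate_to_seconds` and the sample frame are made up for illustration.

```python
import pandas as pd


def truncate_to_seconds(df):
    """Return a copy of df with every datetime column floored to whole seconds.

    Hypothetical helper for illustration; not taken from the original post.
    """
    out = df.copy()
    for col in out.columns:
        # Matches both naive and timezone-aware datetime64 columns.
        if pd.api.types.is_datetime64_any_dtype(out[col]):
            out[col] = out[col].dt.floor("s")
    return out


# Made-up sample data with microsecond precision.
example = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2018-05-29 10:15:30.123456", "2018-05-29 23:59:59.999999"]
        ),
        "value": [1, 2],
    }
)

truncated = truncate_to_seconds(example)
```

Working on a copy leaves the original frame untouched, so the full-precision values are still available if the coerced frame turns out not to be what the parquet writer needs.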
