A week sailing on Maersk Lirquen

In this post I share some facts about my first trip on a container vessel. As the title says, the trip was seven days of sailing on a 300 m container vessel from Singapore to Busan (South Korea). Its purpose was to collect inertial data to use as measurement inputs to a nonlinear model I have built for my PhD project. The project focuses on the inaccurate measurements of the speed log and on how such issues can be avoided with the lowest possible investment.

Parsing Reddit Data with Dask

This post is a step-by-step data exploration of a month of Reddit posts. It covers setting up an AWS server, a Pandas analysis of the dataset, a castra file setup, NLP using Dask, and a sentiment analysis of the comments using the LabMT wordlist. Lastly, there is a WordCloud setup. Everything is done in Python. Enjoy!

Implementing the DBSCAN clustering algorithm in Python

In this post I describe how to implement the DBSCAN clustering algorithm with the Jaccard distance as its metric. The implementation should also be able to handle sparse data.
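
To give a flavour of the idea, here is a minimal sketch in plain Python. It is not the code from the post itself. It assumes each sample is stored as a set of its non-zero feature indices (one simple way to represent sparse data), and the names `jaccard_distance`, `dbscan`, `eps` and `min_pts` are chosen here for illustration. The neighbour search is brute force, so it only suits small datasets.

```python
from collections import deque


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty sets count as identical
    return 1.0 - len(a & b) / union


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force region query; the point itself is always included.
        return [j for j, q in enumerate(points)
                if jaccard_distance(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


if __name__ == "__main__":
    samples = [{1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4}, {10, 11}, {10, 11, 12}, {50}]
    print(dbscan(samples, eps=0.6, min_pts=2))  # [0, 0, 0, 1, 1, -1]
```

Working on sets means only the non-zero entries are ever touched, which is what makes the Jaccard distance a natural fit for sparse data.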

Parsing data using the UNIX terminal

This is an example of how to do simple data parsing in the Unix terminal. We will use commands such as chmod, find, grep, sed, apt-get, nano, cut, sort, uniq, head, tail, less, cat, ssh, wc, echo, man, awk and more.

Visualization of Titanic Disaster using D3

To experiment with visualization tools and concepts (Dimple.js, D3.js, visualization design principles, visual encodings, HTML, CSS, SVG), I created a polished data visualization that tells a story about survival rates in the Titanic disaster and lets the reader explore trends and patterns.

Overview