Analytics

CAPM and Fama French 3 Factor model

Publish date: Feb 8, 2022
Summary: This post tests the CAPM and the Fama-French three-factor model on both time-series and cross-sectional data. The three-factor model performs much better, explaining on average 91% of the variance in time-series regressions but only around 20% in cross-sectional regressions.

Data Science I, Workshop I: Predicting interest rates at the Lending Club

Publish date: Oct 12, 2021
Summary: Lending Club is a fintech company that provides a platform for P2P lending. In this post, we analyse a dataset from Lending Club and try to explain the pricing, i.e. the interest rate, using demographic, economic and time information. R-squared reaches as high as 93.81%.

Youth Risk

Publish date: Sep 13, 2021
Summary: Lead the youth in the right way.
Tags: R Data Analytics Data Visualization

Omega Group Salary

Publish date: Sep 13, 2021
Summary: Is there a significant difference between the salaries of men and women and, if so, is it due to discrimination or to another, possibly valid, determining factor?

GDP Components

Publish date: Sep 13, 2021
Summary: Break down GDP for three countries (China, UK and US). Compare the differences in growth.

COVID-19 Public Use Data

Publish date: Sep 13, 2021
Summary:

Challenge 2: CDC COVID-19 Public Use Data

The CDC COVID-19 Case Surveillance Data is a public use dataset with 12 elements for all COVID-19 cases shared with the CDC. It includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors. The variables can be seen from:

There are well over 28 million individual entries, so we will work with a SQLite database rather than a CSV file. We will produce two graphs that show the death rate (%):
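As a rough illustration of why a database helps here, the death rate per group can be aggregated inside SQLite so that only a handful of summary rows ever reach memory. The sketch below is in Python and is not the post's own code; the table name `cases`, the columns `age_group`, `sex` and `death_yn`, and the toy rows are all assumptions made for the example.

```python
import sqlite3

# Toy in-memory database standing in for the real SQLite file.
# Table name, column names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (age_group TEXT, sex TEXT, death_yn TEXT)")
toy_rows = [
    ("0-9", "Female", "No"),
    ("0-9", "Male", "No"),
    ("0-9", "Male", "Missing"),
    ("80+", "Female", "Yes"),
    ("80+", "Female", "Unknown"),
    ("80+", "Male", "Yes"),
    ("80+", "Male", "No"),
]
conn.executemany("INSERT INTO cases VALUES (?, ?, ?)", toy_rows)


def death_rate_by(conn, column):
    """Return (group, death %) pairs, counting only rows with a known outcome."""
    # Whitelist the column name, since identifiers cannot be bound as parameters.
    if column not in {"age_group", "sex"}:
        raise ValueError(f"unsupported grouping column: {column}")
    query = f"""
        SELECT {column},
               100.0 * SUM(CASE WHEN death_yn = 'Yes' THEN 1 ELSE 0 END)
                     / COUNT(*) AS death_pct
        FROM cases
        WHERE death_yn IN ('Yes', 'No')
        GROUP BY {column}
        ORDER BY {column}
    """
    return conn.execute(query).fetchall()


by_age = death_rate_by(conn, "age_group")
by_sex = death_rate_by(conn, "sex")
for group, pct in by_age:
    print(f"{group}: {pct:.1f}%")
```

Each result is a small list of `(group, percentage)` pairs that can be handed directly to a plotting library. Rows whose outcome is neither `Yes` nor `No` are excluded from the denominator, which is one possible choice; treating them as survivors would lower the rates.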


TFL Sharing Bikes

Publish date: Sep 11, 2021
Summary: Data from the last 6 years.
Tags: R Data Analytics Data Visualization traffic

Survey about social media

Publish date: Sep 11, 2021
Summary: We analyzed the 2016 GSS sample data, using it to estimate population parameters of interest about US adults.
Tags: R Data Analytics Data Visualization society

Climate is changing

Publish date: Sep 11, 2021
Summary: Capture the evidence of temperature change from data.
Tags: R Data Analytics Data Visualization climate

Biden's approval

Publish date: Sep 11, 2021
Summary: Over time, Biden's approval rating has been declining. One reason could be his stance and actions regarding Afghanistan. However, similar swings have happened to every U.S. president.

What kind of movie should I direct? :)

Publish date: Sep 4, 2021
Summary: IMDB analytics
Tags: R data analytics data visualization movie

Shallow financial data analytics

Publish date: Sep 4, 2021
Summary: A small step towards financial data analytics
Tags: R Data Analytics Data Visualization finance

HR Analytics

Publish date: Sep 4, 2021
Summary:

IBM HR Analytics

We analyse a data set on Human Resource Analytics. The IBM HR Analytics Employee Attrition & Performance data set is a fictional data set created by IBM data scientists. Among other things, the data set includes employees’ income, their distance from work, their position in the company, their level of education, etc. A full description can be found on the website.

First let us load the data:


Do you like drinking?

Publish date: Sep 4, 2021
Summary: Alcohol consumption analytics
Tags: Data Analytics

Who are voting for Brexit?

Publish date: Aug 28, 2021
Summary: Reviewing Brexit from a data perspective
Tags: R data analytics data visualization Brexit

What affects life expectancy?

Publish date: Aug 28, 2021
Summary: Life expectancy has increased considerably since WWII. However, the picture differs between regions. Diseases and other issues are becoming more significant for human beings.
Tags: Data Analytics

Animal Rescue Data

Publish date: Aug 28, 2021
Summary:

Animal rescue incidents attended by the London Fire Brigade

The London Fire Brigade attends a range of non-fire incidents (which we call ‘special services’). These ‘special services’ include assistance to animals that may be trapped or in distress. The data is provided from January 2009 and is updated monthly. A range of information is supplied for each incident, including some location information (postcode, borough, ward) as well as the date/time of the incident. We do not routinely record data about animal deaths or injuries.


Tags: R data analytics data visualization animal