Hua – Page 7 – Huafeng (Hua) Zhang

Browse Author by Hua

How will the GDPR impact machine learning? – O’Reilly Media

This article aims to demystify this intersection between ML and the GDPR, focusing on the three biggest questions I’ve received at Immuta about maintaining GDPR-compliant data science and R&D programs. Granted, with an enforcement date of May 25, the GDPR has yet to come into full effect, and a good deal of what we do know about how it will be enforced is either vague or evolving (or both!). But key questions and key challenges have already started to emerge.


List of Google Easter eggs – Wikipedia

Easter eggs are hidden features or messages, inside jokes, and cultural references inserted into media. They are often well hidden, so that users find it gratifying when they discover them, helping form bonds between Google and its users. The easter eggs are sometimes created by employees during their 20% time. Google avoids adding easter eggs to popular search pages, as they do not want to negatively impact usability.[5][6]


Mary Meeker’s 2018 internet trends report: All the slides, plus analysis – Recode

This year, the Kleiner Perkins Caufield & Byers partner released 294 slides in rapid succession, covering everything from smartphone behavior in the U.S. to tech company competition in China.


ETL vs ELT: The Difference is in the How

The ELT approach provides a modern alternative to ETL. However, it’s still evolving, so the frameworks and tools that support the ELT process are not always mature enough to load and process large amounts of data. The upside is very promising: unlimited access to all of your data at any time, and savings in time and effort for developers, BI users, and analysts.


How Random Forest Algorithm Works in Machine Learning

The decision tree is a decision support tool. It uses a tree-like graph to show the possible consequences. If you input a training dataset with targets and features into the decision tree, it will formulate some set of rules. These rules can be used to perform predictions.
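
As a rough illustration of how a tree can "formulate rules" from features and targets, here is a minimal sketch of the smallest possible tree: a single split on one numeric feature. It is not taken from the linked article. The function names `learn_rule` and `predict`, the 0/1 labels, and the toy "hours studied" data are all assumptions made for the example, and real decision trees choose splits with criteria such as Gini impurity or entropy rather than a raw error count.

```python
from typing import List, Tuple

Rule = Tuple[float, int, int]  # (threshold, label if x <= threshold, label if x > threshold)


def learn_rule(features: List[float], targets: List[int]) -> Rule:
    """Pick the single threshold rule with the fewest training errors.

    Assumes one numeric feature and binary targets encoded as 0 or 1.
    """
    values = sorted(set(features))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        # Try both ways of assigning the two labels to the two sides of the split.
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if x <= threshold else right) != y
                for x, y in zip(features, targets)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    _, threshold, left, right = best
    return threshold, left, right


def predict(rule: Rule, x: float) -> int:
    """Apply a learned rule to a new feature value."""
    threshold, left, right = rule
    return left if x <= threshold else right


# Hypothetical toy data: hours studied (feature) and pass/fail (target).
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 1, 1, 1]

rule = learn_rule(hours, passed)
print(rule)                # (4.5, 0, 1): "if hours <= 4.5 predict fail, else pass"
print(predict(rule, 2.5))  # 0
print(predict(rule, 7.5))  # 1
```

A full decision tree repeats this kind of split recursively on each side, which is how it ends up with a set of nested rules rather than one.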


xkcd: Clickbait-Corrected p-Value



Review: The Black Swan by Nassim Nicholas Taleb | Books | The Guardian

Whenever someone tells you not to go to the cause, it is worth heading straight there out of bloody-mindedness. Taleb tells us about an Italian professor who maintains that Taleb could not have come to such conclusions about risk if his background was a Protestant society in which work and reward were linked as cause and effect.


The Random Forest Algorithm – Towards Data Science

Random Forest is a flexible, easy-to-use machine learning algorithm that produces, even without hyper-parameter tuning, a great result most of the time. It is also one of the most used algorithms, because of its simplicity and the fact that it can be used for both classification and regression tasks. In this post, you are going to learn how the random forest algorithm works and several other important things about it.


Statistics for people in a hurry – Towards Data Science

Making decisions based on facts (parameters) is hard enough as it is, but -curses!- sometimes we don’t even have the facts we need. Instead, what we know (our sample) is different from what we wish we knew (our population). That’s what it means to have uncertainty.


A Brief Overview of Outlier Detection Techniques – Towards Data Science

“Observation which deviates so much from other observations as to arouse suspicion it was generated by a different mechanism” (Hawkins, 1980)
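
To make Hawkins's definition concrete, here is a minimal sketch of one common way to flag such observations: the modified z-score based on the median absolute deviation (MAD). This is an illustration only and not necessarily one of the techniques the linked article covers. The function name `mad_outliers`, the sample readings, and the choice to return nothing when the MAD is zero are assumptions; the 0.6745 constant and the 3.5 cutoff are the conventional values for this score.

```python
from statistics import median
from typing import List


def mad_outliers(data: List[float], cutoff: float = 3.5) -> List[float]:
    """Return the values whose modified z-score exceeds the cutoff.

    The modified z-score is 0.6745 * (x - median) / MAD, where MAD is the
    median of the absolute deviations from the median.
    """
    center = median(data)
    mad = median(abs(x - center) for x in data)
    if mad == 0:
        # The score is undefined when at least half the values equal the median;
        # this sketch simply reports no outliers in that case.
        return []
    return [x for x in data if abs(0.6745 * (x - center) / mad) > cutoff]


# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95]
print(mad_outliers(readings))  # [95]
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme value barely moves them, so the outlier cannot hide itself by inflating the spread.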
