Archive for the ‘Projects’ Category

Analytics Expert at Wikistrat

Posted: March 30, 2016 in Projects

I was recently appointed to the Analytic Community Experts group for the crowdsourced consultancy Wikistrat. Based in Washington, DC, Wikistrat operates a global network of over 2,000 subject-matter experts working collaboratively via an online platform to help decision-makers identify solutions to complex strategic challenges. Here, I will offer my insights in data science, machine learning and big data to assist in building the wisdom that is essential for dealing with an increasingly complex world.

I’m excited to announce the availability of a new technology guide that I was contracted to research, develop and write: “insideBIGDATA Guide to Streaming Analytics,” sponsored by Impetus Technologies, Inc.

Many enterprises find themselves at a key inflection point in the big data timeline with respect to streaming analytics technology. There is a huge opportunity for direct financial and market growth for enterprises by leveraging this technology. The goal of this guide is to make sense of the vendor and technology landscape. It’s important to choose a platform that will supply a proven and pre-integrated, performance-tuned stack, ease of use, enterprise-class reliability and flexibility to protect the enterprise from rapid technology changes.

You can download a copy of the guide HERE.

I was pleased to present last evening at the Grid110 Demo Day hosted by The New Mart, LA’s premier fashion mart in DTLA. Grid110 is a new start-up business accelerator in partnership with the Office of the Los Angeles Mayor. My topic was putting a new face on the LA apparel industry using data science methodologies. My analysis used data sets from the Los Angeles Open Data repository. I’m a big proponent of government open data resources with the goal of improving the lives of citizens in ever more data-driven cities.

My presentation is provided below. Check out the data visualizations, especially the geospatial data analysis clusters showing business starts for the past 10 years across the various industry codes that constitute the LA apparel industry. Moving forward, I will be collaborating with Grid110 in 2016 to publish a new Fashion Tech industry report, develop a new Shiny app, and collect data points for a new Fashion Tech sector database. Exciting stuff!

 

Insider’s Guide to Apache Spark

Posted: November 9, 2015 in Projects

I’d like to announce my new technology guide, “An Insider’s Guide to Apache Spark,” written on behalf of insideBIGDATA and sponsored by industry analytics leader TIBCO. The guide is a useful new resource directed toward enterprise thought leaders who wish to gain strategic insights into this exciting new computing framework. As one of the most exciting and widely adopted open-source projects, Apache Spark in-memory clusters are driving new opportunities for application development as well as increased intake of IT infrastructure.

The guide includes the following topics:

  • An overview of Spark
  • Why Spark is so hot
  • Looking at Spark through a Hadoop lens
  • Spark SQL
  • The TIBCO–Spark connection

You can download my new Spark guide HERE.

I’m also taking part in an upcoming TIBCO webinar on Nov. 17 at 1:30pm ET. Click HERE to register.

My long affiliation with LA’s preeminent fashion mart, The New Mart, has been a fruitful one. This collection of over 70 high-end fashion showrooms is managed by a forward-thinking team that allowed me to apply statistical learning methods to increase the reach of their many clothing lines through the use of social media data sources. I built some cool technology to yield a weekly “Fashion top 10” that serves to drive The New Mart’s social media effort. By coupling sentiment analysis with data sources like Twitter, Facebook, Instagram and fashion blogs, we approach spreading brand awareness in a strategic and focused manner.

insideBIGDATA Guide to Retail

Posted: September 9, 2015 in Projects, Uncategorized

I’d like to announce the availability of a new technology guide that I was contracted to research, develop and write: “insideBIGDATA Guide to Retail,” sponsored by Dell and Intel. This guide is directed toward line of business leaders in conjunction with enterprise technologists with a focus on the above opportunities for retailers and how Dell can help them get started. The guide also will serve as a resource for retailers that are farther along the big data path and have more advanced technology requirements.

I was excited about writing this guide since I spend a lot of my time as a practicing data scientist in the fashion industry where I build machine learning solutions to enhance brand awareness.

You can download a copy of the guide HERE.

I’m very proud (and relieved) to announce that my year-long+ book project is finally done! “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R” is available from Technics Publications. The book provides an introduction to the entire data science process, highlighting the ways that machine learning can be used to solve business problems. Both supervised and unsupervised statistical learning techniques are included. The R statistical programming language is used throughout. Here is the table of contents:

Introduction

Chapter 1: Machine Learning Overview

Chapter 2: Data Access

Chapter 3: Data Munging

Chapter 4: Exploratory Data Analysis

Chapter 5: Regression

Chapter 6: Classification

Chapter 7: Evaluating Model Performance

Chapter 8: Unsupervised Learning

The book is perfect for newbies just entering the data science field who wish to quickly get up to speed with the technology. I plan to use the book for the introductory courses I teach for corporations and universities. You can pre-order the book on Amazon HERE. You can find all the R code used in the book at this GitHub repo.

 

I’m pleased to announce that I was contracted to research, develop and write a new technology guide, “insideBIGDATA Guide to Scientific Research,” sponsored by Dell and Intel. The goal for this Guide is to provide a road map for scientific researchers wishing to capitalize on the rapid growth of big data technology for collecting, transforming, analyzing, and visualizing large scientific data sets.

I was particularly excited about writing this guide since, in a previous life, I was a researcher in the data analysis effort for a large-scale astrophysics project.

You can download a copy of the guide HERE.

This week I started a boot-camp-style corporate training gig over at Southern California Edison in Irwindale. The title of the seven-week course is “Introduction to R Programming,” although I’m teaching it like an intro to data science class. The contract was arranged through UC Irvine as part of their popular data science certificate program.

The SCE group attending the class is from a broad spectrum of SCE departments including IT, business intelligence, customer analytics, power supply, and business analysis. The participants seem very capable and are eager to move into the data science realm. I’m quite pleased to take some time off my busy project work schedule for a serious teaching assignment like this. Very rewarding!

I was recently recruited by the Economist (“Look Ahead powered by GE”) to write an article on the rise of data lake technology to further enable the industrial Internet. You can check out my non-bylined piece HERE.