On this page, we outline some additional insights from exploring and analyzing the dataset. We tried to answer the questions Grameen America had, but ran into limitations and difficulties. Here are our additional insights on "Gentrification" and "Swipe vs Cash Out" data.
"Gentrification"
"How might gentrification be affecting our members?" "How might leading indicators of gentrification affect decisions on where to start new branches?"
"Gentrification" is a tricky term, as it is not easily defined and
can have different meanings and implications.
Instead of defining "gentrification" as we did for "retention," we decided to focus on specific, measurable features that
may be related to gentrification (such as median rent and income).
We wanted to focus on changes over time, and referenced external sources such as
the American Community Survey and US Census Bureau data.
For the analysis, we looked at features that could be indicative of gentrification,
such as home value, household income, population, and education levels
by zip code, and tried to track their changes over time, from 2015 to 2018.
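As a rough illustration of what tracking a feature over time by zip code can look like, here is a minimal sketch that computes the percent change in one feature between two years. The record layout, the function name `percent_change_by_zip`, and all values are hypothetical and made up for illustration; they are not taken from our dataset or from the Census data.

```python
# Hypothetical records: (zip_code, year, median_rent).
# These values are invented for illustration only.
records = [
    ("10001", 2015, 1500.0),
    ("10001", 2018, 1800.0),
    ("10002", 2015, 1200.0),
    ("10002", 2018, 1260.0),
    ("10003", 2015, 1400.0),  # no 2018 value, so this zip code is skipped
]


def percent_change_by_zip(records, start_year, end_year):
    """Return {zip_code: percent change of the value from start_year to end_year}."""
    # Group the values by zip code, then by year.
    by_zip = {}
    for zip_code, year, value in records:
        by_zip.setdefault(zip_code, {})[year] = value

    changes = {}
    for zip_code, by_year in by_zip.items():
        # Skip zip codes missing either endpoint or with a zero baseline.
        if start_year not in by_year or end_year not in by_year:
            continue
        start, end = by_year[start_year], by_year[end_year]
        if start == 0:
            continue
        changes[zip_code] = (end - start) / start * 100.0
    return changes


changes = percent_change_by_zip(records, 2015, 2018)
for zip_code, pct in sorted(changes.items()):
    print(f"{zip_code}: {pct:.1f}% change in median rent")
```

Comparing only the first and last year is one simple choice; it ignores what happens in between, so a year-by-year comparison may be more informative when more years are available.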
We tried to find trends through feature importance, using XGBoost and Random Forest.
However, the models used for our feature importance analysis had poor accuracy (less than 75%) for Retention Metric 2 and Retention Metric 3.
Although Retention Metric 1 had much better accuracy, there seemed to be no significant trends between the "gentrification" features and Retention Metric 1.
Here are some of the graphs showing our results. For both of these categories, the "yes" retention and "no" retention groups had similar trends, showing no distinctive traits between the two groups.
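One way to put a number on how similar the "yes" and "no" retention groups are for a given feature is a standardized mean difference (the difference in group means divided by the pooled standard deviation). The sketch below is only an illustration of that idea, not the method we used for the graphs; the function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled sample standard deviation.

    Each group needs at least two values. Returns 0.0 when neither group
    has any spread, since the ratio is undefined in that case.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    if pooled_sd == 0:
        return 0.0
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical percent changes in a feature (e.g. median rent) for the
# zip codes of retained and non-retained members. Invented values.
retained = [5.0, 6.0, 7.0, 8.0]
not_retained = [5.5, 6.5, 7.5, 8.5]

smd = standardized_mean_difference(retained, not_retained)
print(f"standardized mean difference: {smd:.3f}")
```

A value close to zero suggests the two groups look alike on that feature, which is consistent with, but does not prove, the absence of a relationship.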
These results don't necessarily mean that there are no correlations between "gentrification" and the Grameen America members. There could be factors related to gentrification that influence Grameen America members, but it was not possible to find such trends with the limited dataset available. For example, for the Census data, only 3 years' worth of data was available. Trends might have emerged if data had been available for a longer period of time.