|Jennings Anderson||Austin Arrington||Chris Cacciapaglia|
|Etienne Fluet-Chouinard||Krishna Karthik Gadiraju||Anuj Karpatne|
|Jia Li||Jiangxiao Qiu||Ishwarya Rajendrababu|
|Solomon Vimal||Lynn Waterhouse|
Big Human Data: For Humans, by Humans
Proposed here is the notion of “big human data” analysis. As today’s “big data” mostly consists of human-generated data such as social media or volunteered geographic information (VGI), our analysis techniques must evolve to account for the socio-behavioral aspects of data production. I present this argument along with a new tool for localized analysis of the massive OpenStreetMap editing history. The tool aims to better understand the VGI practices surrounding the rapid digital convergence of online mappers in the wake of a disaster.
With a background in Mathematics and Computer Science, I discovered that the most interesting questions are not about computers themselves, but rather how we as people use and interact with these devices. As such, I am currently interested in the analysis of volunteered geographic information contributed to OpenStreetMap in the wake of a disaster, as well as learning about human behavior in disaster through geo-spatial analysis of big social media data.
|Human Computer Interaction|
Environmental Science / Ecosystem Restoration
Color Analysis of Crowdsourced Images for Ecological Monitoring
Remote sensing technology, such as satellite imagery, is a powerful tool for studying spatial ecology. However, understanding spatial ecology often requires finer scales than is afforded by satellite imagery, and the need for “ground-truthing” still exists. Leveraging “Big Data,” or more specifically, geo-tagged and time-stamped images provided through open source online networks, may offer a solution to help better understand scale and pattern in ecological systems.
I have a background working in web and app development and am interested in leveraging tools such as GIS, mobile technology, and data mining to help humans adapt to climate change, ecosystem degradation, and related ecological problems. Currently I am working on an MS in Environmental Science / Ecosystem Restoration at SUNY-ESF. My thesis will focus on potential ecosystem impacts of the pending Nicaragua Canal in the context of climate change. My research methods will involve GIS analysis of MODIS project evapotranspiration data in Nicaragua. Another primary research interest of mine is finding ways to leverage mobile technology and open source software / web applications for environmental problem solving. In particular, I'm working on data scraping methods to find recognizable patterns of climate-induced change in the seasonal growth cycles of different plant species. For example, by scraping images of a particular patch of vegetation that appears repeatedly in geo-tagged photos posted on Flickr, we might be able to reveal robust insights into the nature and degree of the impact of climate change on plant phenology and interrelated ecological issues. These images can then be integrated with remote sensing data (such as MODIS imagery) to produce an interface that could operate as an ecological research database, collecting and organizing data at various scales for a particular area of interest. In general, I am interested in the potential of combining citizen science and professional research, powered by cloud and mobile technologies, in order to gain new environmental insights.
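As a rough illustration of what color analysis of a vegetation patch could look like, the sketch below computes the green chromatic coordinate, G / (R + G + B), averaged over the pixels of a patch. This index is commonly used in camera-based phenology work, but nothing above states that it is the method used in this project; the function name and the sample pixel values are hypothetical.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def green_chromatic_coordinate(pixels: Iterable[Pixel]) -> float:
    """Return the mean of G / (R + G + B) over the given RGB pixels."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        brightness = r + g + b
        if brightness == 0:
            continue  # pure black pixels carry no color information
        total += g / brightness
        count += 1
    if count == 0:
        raise ValueError("no usable pixels in patch")
    return total / count


# Hypothetical pixel samples from the same patch in two photos.
spring_patch = [(60, 120, 60), (50, 100, 50)]
autumn_patch = [(120, 90, 60), (100, 75, 50)]

spring_gcc = green_chromatic_coordinate(spring_patch)
autumn_gcc = green_chromatic_coordinate(autumn_patch)
print(f"spring GCC = {spring_gcc:.3f}, autumn GCC = {autumn_gcc:.3f}")
```

Tracking this value for the same patch across time-stamped photos would give a simple greenness time series that could be compared against remote sensing data.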
|Geographic Information Systems||Ecosystem Restoration||Open Source Citizen Science|
Climate change refuges in the oceans
Identify coral reef refugia in the Pacific, Indian, and Atlantic Oceans under differing climate change scenarios, using climate-envelope models together with high-resolution environmental data at a global scale.
I am a 3rd year PhD student in Florida. My research deals with geospatial modeling of ecological changes in coral reefs in response to climate change.
|geospatial ecology||climate change||statistics and probabilities|
Center for Limnology
Estimating global inland fish catch from case study extrapolation.
The unreliability of nationally reported statistics on inland fish catch leaves large uncertainty around the status of inland fisheries and limits responsible management of the fish resource. Alternative approaches for inventorying the status and trends of inland fisheries must be explored; however, because of their simplicity, existing global-scale yield models do not provide a credible alternative to nationally reported statistics. I propose to generate global predictions of inland fish yield using a machine learning approach. The proposed approach presents two challenges: integrating data from multiple sources, including large data layers such as a high-resolution wetland map used to distinguish water body types, and selecting an adequate model from among machine learning methods. This analysis will provide a new point of comparison for assessing the quality of national reporting and FAO’s own confidence level in reporting.
I am a 3rd year PhD student with a background in geosciences. My doctoral research broadly revolves around the impacts and interactions of humans with freshwater ecosystems at the global scale, in particular river network fragmentation, wetland conversion, conservation portfolio, and fisheries.
|Inland Fisheries||Global Hydrology||Spatial Modeling|
Detecting Extreme Events in Global Gridded Climate Data using Gaussian Processes
Extreme weather events include droughts, heat waves, floods, cyclones, and wildfires. Several kinds of extreme weather events, such as the heat waves in Europe in 2003 and Russia in 2010 and the California drought, have increased in frequency in the past few decades. To predict these extreme events and understand their impact on the economy, there is an increasing need to study them. Climate and weather data are spatiotemporal in nature; therefore, methods to identify these events should take spatial and temporal autocorrelation into account. Techniques for finding extreme events are often called anomaly and/or outlier detection. Gaussian Process (GP) learning is one such method, and it can model spatial and temporal correlations. In this paper, we briefly explain how GP can be applied to anomaly detection and the problems GP faces when dealing with big climate data.
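To make the idea concrete, here is a minimal sketch of GP-based anomaly detection on a one-dimensional time series. It is an illustration under stated assumptions, not the method from the paper: the squared-exponential kernel, the hyperparameter values, the zero-mean prior, the threshold of 3, and the synthetic data are all assumptions. A GP is fitted to a reference record, and new observations are flagged when they fall far outside the predictive distribution. It is written in plain Python so it runs without dependencies; a numerical library would be the usual choice in practice.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rbf(a: float, b: float, length_scale: float, variance: float) -> float:
    """Squared-exponential covariance between two time points."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A: List[List[float]]) -> List[List[float]]:
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve(L: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(t: Sequence[float], y: Sequence[float], length_scale: float = 2.0,
           variance: float = 1.0, noise: float = 0.1
           ) -> Callable[[float], Tuple[float, float]]:
    """Fit a zero-mean GP; return a function giving predictive (mean, variance)."""
    n = len(t)
    K = [[rbf(t[i], t[j], length_scale, variance) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve(L, list(y))

    def predict(t_star: float) -> Tuple[float, float]:
        k_star = [rbf(ti, t_star, length_scale, variance) for ti in t]
        mean = sum(k * a for k, a in zip(k_star, alpha))
        v = solve(L, k_star)
        # Variance of a new noisy observation at t_star.
        var = variance + noise ** 2 - sum(k * vi for k, vi in zip(k_star, v))
        return mean, max(var, 1e-12)

    return predict


def flag_anomalies(predict: Callable[[float], Tuple[float, float]],
                   t_new: Sequence[float], y_new: Sequence[float],
                   threshold: float = 3.0) -> List[int]:
    """Indices of observations more than `threshold` predictive SDs from the mean."""
    flagged = []
    for i, (ti, yi) in enumerate(zip(t_new, y_new)):
        mean, var = predict(ti)
        if abs(yi - mean) / math.sqrt(var) > threshold:
            flagged.append(i)
    return flagged


# Synthetic example: a smooth reference record, plus new observations
# where one value (index 2) has been shifted by +3.
train_t = [float(i) for i in range(20)]
train_y = [math.sin(0.5 * ti) for ti in train_t]
new_t = [2.5, 5.5, 8.5, 11.5, 14.5]
new_y = [math.sin(0.5 * ti) for ti in new_t]
new_y[2] += 3.0

flagged = flag_anomalies(fit_gp(train_t, train_y), new_t, new_y)
print("anomalous indices:", flagged)
```

The exact solve scales cubically with the number of training points, which hints at why exact GP inference becomes difficult on large gridded climate data.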
I'm a member of the STAC (Spatio-Temporal Analytics and Computing) Lab at North Carolina State University, headed by Dr. Ranga Raju Vatsavai. I'm primarily interested in large scale analytics of spatial and spatio-temporal data. I'm currently working on anomaly detection in climate data.
|Spatio Temporal Data Mining||Large Scale Data Analytics||Large Scale Data Management|
Computer Science and Engineering
Global Monitoring of Inland Water Dynamics: A Data-driven Approach
Freshwater, which is only available in inland water bodies such as lakes, reservoirs, and rivers, is becoming increasingly scarce across the world, and this scarcity poses a global threat to human sustainability. Global monitoring of inland water bodies is necessary for policy-makers and the scientific community to address this problem. The promise of data-driven approaches, coupled with the availability of remote sensing data, presents opportunities as well as challenges for global monitoring of inland water bodies. My research aims at developing predictive models that address the challenges in analyzing remote sensing data for creating the first global monitoring system of inland water dynamics.
I am a PhD candidate in the Department of Computer Science and Engineering at the University of Minnesota. I am working with my advisor Prof. Vipin Kumar on developing data-driven approaches for applications in Earth system monitoring. My research interests involve spatio-temporal data mining, heterogeneous machine learning, and spatial statistics applications.
|Data Mining||Machine Learning||Earth Science Applications|
College of Science and Engineering
Spatio-Temporal Analysis of Hospital Admission Data for Improved Population Health in New York City
There has been an increased focus on Precision Medicine -- customized healthcare delivery based on individual patient information (e.g. family history, genetic information, etc.) -- highlighted by President Obama’s $215M “Precision Medicine Initiative”. While this route is promising, there are several knowledge gaps (especially in uncertainty quantification) that limit Precision Medicine’s applicability. An alternative approach to creating a healthy society is to focus on how large-scale environmental factors -- natural environment, social environment, built environment, demographics, etc. -- as opposed to individual genetic information (such as that used in Precision Medicine) can inform community or population health. This work focuses on applying spatio-temporal analysis to hospital admissions data to understand how various environmental factors influence the access to and type of care that patients from different communities might need, with the goal of improving population health for communities in New York City (NYC).
As a PhD student, I became intrigued by Data Science, and I am now working with Dr. Vipin Kumar. In recent months I have devoted myself to domain-specific data mining, specifically medical data analysis. My research object is a medical data set from Manhattan, NYC, which lets me analyze how patient characteristics and distributions vary spatially and temporally. We are also working out methods to make predictions on the issues that health institutions care about, such as the distribution patterns of serious illness, hospital catchment areas, and the degree of traffic difficulty patients face in reaching health care.
Department of Zoology
Assessing Future Benefits of Ecosystems: Multidimensional Spatial-temporal data
Understanding the future prospects of ecosystems and the benefits they provide to society (termed "ecosystem services") is critical yet remains challenging in the context of unpredictable global changes. Such research is further complicated by Big Data challenges, in particular those pertaining to multidimensional spatial-temporal data. This study seeks to develop approaches for analyzing such large and complex data to detect long-term trends, thresholds, interactions, and spatial-temporal dynamics of ecosystem services, and to translate that understanding into decision-making that supports a transition to sustainability.
I am a PhD candidate at the University of Wisconsin-Madison. For my graduate research, I am interested in how global change drivers affect our future ecosystems and the benefits they provide to our society, and in what ways we can balance human needs while building resilience of coupled human-natural systems.
|Landscape ecology||Global environmental change||Social-ecological systems|
Computer Science & Engineering
Weather Data Characterization Tool
The use of weather data in data analytics has become prevalent across a variety of applications, which aid strategic business decision making in disciplines ranging from healthcare, transportation, and planning to economics, and support research investigations in the physical sciences. This has increased the demand for access to weather data in the form of daily, monthly, and annual summaries, which contain values for temperature (min, max, and average), precipitation, wind speed, snowfall, and other parameters in record format. Consumers of the open weather dataset provided by the National Centers for Environmental Information face a major challenge: they often lack the requisite domain knowledge to interpret the detailed meteorological data available. Hence, data scientists and other consumers of the data must write their own tooling to translate the values into meaningful descriptions. The objective of this paper is to advocate for open data set publishers to supply data set interpretation tools (DSITs) to facilitate consumption of the released data. Specifically, we describe a DSIT for characterizing the weather on a particular day. The tool produces a data set that anyone can consume without expertise in the meteorological domain. DSITs are themselves analytics in that they are encoded representations of expert knowledge, which provide actionable insight.
I'm a graduate student at NYU SoE and, along with another team member, am working on a project in the field of Big Data Analytics to analyze NYC Citibike demand and the "station full" scenario based on the weather.
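A minimal sketch of what such a tool might do is shown below: it maps one daily summary record to plain-language labels. The field names, units (degrees Celsius, millimetres, metres per second), thresholds, and label vocabulary are all hypothetical, chosen only for illustration; they are not taken from the actual DSIT or from the published data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DailySummary:
    """One day's weather record (hypothetical field names and units)."""
    tavg_c: float      # average temperature, degrees Celsius
    precip_mm: float   # total precipitation, millimetres
    snow_mm: float     # snowfall, millimetres
    wind_ms: float     # average wind speed, metres per second


def characterize(day: DailySummary) -> List[str]:
    """Translate raw values into descriptive labels using assumed thresholds."""
    labels = []

    # Temperature band.
    if day.tavg_c < 0:
        labels.append("freezing")
    elif day.tavg_c < 10:
        labels.append("cold")
    elif day.tavg_c < 20:
        labels.append("mild")
    elif day.tavg_c < 30:
        labels.append("warm")
    else:
        labels.append("hot")

    # Precipitation type and intensity; snowfall takes precedence over rain.
    if day.snow_mm > 0:
        labels.append("snowy")
    elif day.precip_mm >= 10:
        labels.append("heavy rain")
    elif day.precip_mm > 0:
        labels.append("light rain")
    else:
        labels.append("dry")

    # Wind is reported only when it is notable.
    if day.wind_ms >= 10:
        labels.append("windy")

    return labels


winter_day = DailySummary(tavg_c=-3.0, precip_mm=0.0, snow_mm=25.0, wind_ms=12.0)
summer_day = DailySummary(tavg_c=22.0, precip_mm=0.0, snow_mm=0.0, wind_ms=2.0)
print(characterize(winter_day))
print(characterize(summer_day))
```

Keeping the thresholds in one place like this is the point of the argument above: the expert judgment is encoded once by the publisher instead of being re-derived by every consumer.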
|Big Data Analytics||Front end development|
Institute for the Environment, Department of Geography
Global Flood Risk Management
Can we build better data warehouses and geo-servers to share flood (or disaster) risk maps and related data and models for interoperability and rapid update at global scale? Can we enable seamless visualization of risk maps by overlaying them on existing map services?
I am currently working as a resident scholar at the new NOAA National Water Center at the University of Alabama, Tuscaloosa. The summer project undertaken here is called the National Flood Interoperability Experiment (NFIE), and it is an attempt to produce next-generation operational flood forecasting at national scale for the United States. I am also a visiting scholar affiliated with the University of North Carolina, Chapel Hill. I am currently carrying out my thesis here for my European master's programme, Erasmus-Mundus Flood Risk Management. The programme's consortium comprises 4 universities in Germany, the Netherlands, Spain, and Slovenia, and since 2013 I have spent a semester at each. The consortium also has leading flood risk management industrial partners: Deltares (Netherlands), DHI (Denmark), HR Wallingford (UK), and ICHARM (Japan).
Assessing small scale fisheries effort using satellite imagery data
High resolution remote sensing data provides the unique opportunity to assess small-scale fisheries, specifically fishing effort, at the global level. UCSD has been given access, through partnership with the DigitalGlobe Foundation, to a large repository of fine scale remote sensing data. We propose using a search algorithm to identify possible boats from the data and a crowd sourcing platform to verify the algorithm output.
I am a Biological Oceanography PhD student in Brice Semmens' lab at Scripps Institution of Oceanography at the University of California, San Diego. My research focuses on the development of quantitative models for studying population biology and ecology, with an emphasis on fisheries. I am currently working on projects involving a stock assessment of white seabass for the state of California; estimating Nassau grouper population size at a spawning aggregation in the Cayman Islands; evaluating fisheries across the US using a variety of metrics (e.g., delay difference models); and estimating abundances of salmonid species returning to river sections in the upper Columbia Basin. I have had the pleasure of being a Scripps Classroom Connections GK-12 Fellow working in a Biology classroom at Kearny SCT High School, and I am dedicated to continuing to work in education. I am interested in developing models and techniques that help assess the environment more efficiently. My background is in statistics, and I really enjoy designing experiments, coming up with ways to analyze data, and disseminating results in ways the general public can understand and appreciate.
|quantitative ecology||population dynamics||applied statistics|