The goal of bcsa is to provide datasets for source apportionment of light absorbing carbon (LAC) in Blantyre, Malawi. This package combines datasets collected as part of two projects. The first project determined Absorption Angstrom Exponent (AAE) values of local pollution sources in Blantyre, Malawi. AAE values can be used to differentiate LAC from fossil-fuel and biomass-based sources. The second project determined light absorbing carbon concentrations by mobile, personal and stationary monitoring in Blantyre.
The package includes the following seven datasets:
- df_aae: Data from experiments to determine AAE values
- df_mm: Mobile monitoring data in eight settlements
- df_mm_road_type: Mobile monitoring data classified by highways (main_road) and non-highways (non_main_roads) in eight settlements
- df_pm: Personal monitoring data in four settlements
- df_pm_trips: Data on times when open waste burning was observed during the personal monitoring
- df_sm: Raw data from stationary monitoring in two settlements
- df_collocation: Data from periods when the two LAC monitors were placed and run next to each other to check data quality
This study used the MA200 micro-aethalometer to measure light absorbing carbon (LAC) concentrations. The MA200 measures LAC concentrations in real time at five different wavelengths, which allows for source apportionment.
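As a rough illustration of how measurements at two wavelengths feed into source apportionment, the sketch below computes an AAE from a pair of absorption coefficients using the standard two-wavelength formula. The function name `calc_aae` is hypothetical and not part of bcsa, and the default wavelengths (375 nm for UV, 880 nm for IR) are assumptions that should be checked against the instrument documentation. In the literature, AAE values close to 1 are commonly associated with fossil fuel combustion and higher values with biomass burning.

```r
# Two-wavelength Absorption Angstrom Exponent:
#   AAE = -ln(babs_short / babs_long) / ln(lambda_short / lambda_long)
# calc_aae is a hypothetical helper, not a function exported by bcsa.
# The default wavelengths (in nm) are assumed values for the UV and IR channels.
calc_aae <- function(babs_short, babs_long,
                     lambda_short = 375, lambda_long = 880) {
  -log(babs_short / babs_long) / log(lambda_short / lambda_long)
}

# Made-up absorption coefficients, for illustration only.
# Absorption that scales as 1 / wavelength gives an AAE of exactly 1.
babs_ir <- 10
babs_uv <- babs_ir * (880 / 375)
calc_aae(babs_uv, babs_ir)
```

The function is vectorised, so it can be applied directly to whole columns such as uv_babs and ir_babs.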
Installation
You can install the development version of bcsa from GitHub with:
# install.packages("devtools")
devtools::install_github("Global-Health-Engineering/bcsa")
Alternatively, you can download the individual datasets as a CSV or XLSX file from the table below.
dataset | CSV | XLSX |
---|---|---|
df_aae | Download CSV | Download XLSX |
df_collocation | Download CSV | Download XLSX |
df_mm | Download CSV | Download XLSX |
df_mm_road_type | Download CSV | Download XLSX |
df_pm | Download CSV | Download XLSX |
df_pm_trips | Download CSV | Download XLSX |
df_sm | Download CSV | Download XLSX |
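A downloaded CSV file can be read with base R. The sketch below is only an illustration: the file path is hypothetical, and a tiny stand-in file with made-up rows is written first so that the example runs on its own.

```r
# Hypothetical local path; replace it with the location of the downloaded file.
csv_path <- file.path(tempdir(), "df_pm_trips.csv")

# Stand-in content with made-up values so that this sketch is self-contained.
writeLines(
  c("id,settlement_id,time",
    "1,A,09:15:00",
    "2,B,14:40:00"),
  csv_path
)

# Read the file; times stay as character strings until converted explicitly.
trips <- read.csv(csv_path, stringsAsFactors = FALSE)
str(trips)
```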
Datasets
This data package has seven datasets: df_aae, df_mm, df_mm_road_type, df_pm, df_pm_trips, df_sm and df_collocation.
df_aae
This dataset contains data from experiments to determine AAE values of local pollution sources in Blantyre, Malawi. The following emission sources were sampled:
- Vehicular emission
- Waste Burning (Plastics)
- Waste Burning (Plastic-based textiles, e.g., polyester)
- Waste Burning (Garden Waste)
- Waste Burning (Cardboard and Paper)
- Mixed waste burning
- Cooking (Using Solid Biofuels - Wood, Charcoal, Briquettes)
All the data mentioned above were collected over 17 days, from 16th May to 1st June, 2023.
Vehicular emissions: three diesel pick-up trucks were sampled. The exhaust was monitored by the MA200, positioned approximately 3 meters from the vehicle’s exhaust, after the vehicle engine was started. Monitoring was conducted for 20 minutes. Also, mobile monitoring was carried out on three heavily trafficked roads during peak traffic hours, with a duration of 20 minutes on each road.
Open waste burning emissions (individual components burning): various waste components (plastics, textiles, cardboard and paper, wood, and leaves) were burned in a semi-open guard shelter. The shelter, covered on three sides and open on one side, allowed for burning at the edge of the open side. The MA200 monitor was positioned 3 meters from the open side, with the micro-cyclone attached to the MA200 placed 1.5 meters above the ground. The amount of each waste component burned was determined through trial and error, so that concentrations remained within the desired levels while burning lasted long enough. A consistent 20-minute burning duration was chosen to align with vehicle monitoring. Each waste type was burned in three replicates, with a gap of at least half an hour between experiments, during which the shelter was ventilated.
Mixed waste burning: six known mixed waste dumps across the city were monitored, each for a 20-minute duration.
Cooking emissions: the MA200 was placed at a guardian shelter in Queens, known for using firewood for cooking. The micro-cyclone was positioned 1.5 meters above the ground and 3 meters from the front windows section of the shelter. Sampling was conducted for 20 minutes on three days during peak cooking times.
library(bcsa)
The df_aae data set has 21 variables and 1199 observations. For an overview of the variable names, see the following table.
df_aae |>
head() |>
gt::gt() |>
gt::as_raw_html()
serial_number | session_id | date | time | lat | long | uv_bcc | blue_bcc | ir_bcc | uv_babs | blue_babs | ir_babs | date_time | day_type | id | date_start | date_end | start_time | end_time | exp_type | emission_source |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time of monitoring |
lat | double | latitude of location of monitoring |
long | double | longitude of location of monitoring |
uv_bcc | double | concentration of black carbon at the UV (ultraviolet) wavelength channel in ng/m3 |
blue_bcc | double | concentration of black carbon at the Blue wavelength channel in ng/m3 |
ir_bcc | double | concentration of black carbon at the IR (infrared) wavelength channel in ng/m3 |
uv_babs | double | absorption coefficient of black carbon at the UV (ultraviolet) wavelength channel |
blue_babs | double | absorption coefficient of black carbon at the Blue wavelength channel |
ir_babs | double | absorption coefficient of black carbon at the IR (infrared) wavelength channel |
date_time | double | date and time of monitoring |
day_type | character | type of day of monitoring - weekend or weekday |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | end date of the experiment |
start_time | double | starting time of the experiment |
end_time | double | end time of the experiment |
exp_type | character | type of experiment |
emission_source | character | source of emission |
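One possible use of these columns is to summarise an AAE per emission source. The following sketch uses a synthetic stand-in for df_aae with made-up values and hypothetical source labels; the channel wavelengths (375 nm and 880 nm) are assumptions, and the median is just one reasonable choice of summary.

```r
# Synthetic stand-in for df_aae; values and source labels are made up.
toy_aae <- data.frame(
  emission_source = c("vehicle", "vehicle", "wood", "wood"),
  uv_babs = c(24, 23, 60, 55),
  ir_babs = c(10, 10, 10, 10)
)

# AAE per observation from the UV and IR absorption coefficients
# (375 nm and 880 nm are assumed channel wavelengths).
toy_aae$aae <- -log(toy_aae$uv_babs / toy_aae$ir_babs) / log(375 / 880)

# Median AAE for each emission source.
aae_by_source <- aggregate(aae ~ emission_source, data = toy_aae, FUN = median)
aae_by_source
```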
df_mm
Mobile monitoring data from eight settlements in Blantyre, Malawi (four planned and four unplanned).
Mobile monitoring utilised a vehicle equipped with a portable MA200 instrument. The micro-cyclone was positioned outside the car’s front window at an elevation of 1.5 meters above the ground. The vehicle was driven at a speed of less than 20 km/h.
The monitoring took place from 3rd May to 14th May, 2023, covering both weekdays and weekends.
The df_mm data set has 23 variables and 3956 observations. For an overview of the variable names, see the following table.
df_mm |>
head() |>
gt::gt() |>
gt::as_raw_html()
serial_number | session_id | date | time | lat | long | uv_bcc | blue_bcc | ir_bcc | uv_babs | blue_babs | ir_babs | date_time | day_type | id | date_start | date_end | start_time | end_time | exp_type | settlement_id | time_of_day | type_of_settlement |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time of monitoring |
lat | double | latitude of location of monitoring |
long | double | longitude of location of monitoring |
uv_bcc | double | concentration of black carbon at the UV (ultraviolet) wavelength channel in ng/m3 |
blue_bcc | double | concentration of black carbon at the Blue wavelength channel in ng/m3 |
ir_bcc | double | concentration of black carbon at the IR (infrared) wavelength channel in ng/m3 |
uv_babs | double | absorption coefficient of black carbon at the UV (ultraviolet) wavelength channel |
blue_babs | double | absorption coefficient of black carbon at the Blue wavelength channel |
ir_babs | double | absorption coefficient of black carbon at the IR (infrared) wavelength channel |
date_time | double | date and time of monitoring |
day_type | character | type of day of monitoring - weekend or weekday |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | end date of the experiment |
exp_type | character | type of experiment |
settlement_id | character | settlement name at which the monitoring was conducted |
time_of_day | character | the days are divided into three types - morning, first half and second half |
type_of_settlement | character | type of settlement - formal or informal |
start_time | double | starting time of the experiment |
end_time | double | ending time of the experiment |
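A typical first step with the mobile monitoring data might be to compare concentrations across settlement types and day types. The sketch below uses a synthetic stand-in for df_mm; all values are made up, and the category labels are taken from the variable descriptions above rather than from the data itself.

```r
# Synthetic stand-in for df_mm; all values are made up.
toy_mm <- data.frame(
  type_of_settlement = rep(c("formal", "informal"), each = 4),
  day_type = rep(c("weekday", "weekday", "weekend", "weekend"), times = 2),
  ir_bcc = c(2000, 3000, 1500, 2500, 6000, 8000, 5000, 7000)
)

# Mean IR-channel concentration per settlement type and day type.
mm_summary <- aggregate(
  ir_bcc ~ type_of_settlement + day_type,
  data = toy_mm,
  FUN = mean
)

# Convert from ng/m3 to ug/m3 for easier reading.
mm_summary$ir_bcc_ug_m3 <- mm_summary$ir_bcc / 1000
mm_summary
```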
df_mm_road_type
This dataset was collected during mobile monitoring and is the same as df_mm, except that the highways and non-highways are demarcated in this dataset.
The df_mm_road_type data set has 25 variables and 3540 observations. For an overview of the variable names, see the following table.
df_mm_road_type |>
head() |>
gt::gt() |>
gt::as_raw_html()
serial_number | session_id | date | time | lat | long | uv_bcc | blue_bcc | ir_bcc | uv_babs | blue_babs | ir_babs | date_time | day_type | id | date_start | date_end | exp_type | id_road_type | settlement_id | time_of_day | type_of_settlement | start_time | end_time | type_of_road |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time of monitoring |
lat | double | latitude of location of monitoring |
long | double | longitude of location of monitoring |
uv_bcc | double | concentration of black carbon at the UV (ultraviolet) wavelength channel in ng/m3 |
blue_bcc | double | concentration of black carbon at the Blue wavelength channel in ng/m3 |
ir_bcc | double | concentration of black carbon at the IR (infrared) wavelength channel in ng/m3 |
uv_babs | double | absorption coefficient of black carbon at the UV (ultraviolet) wavelength channel |
blue_babs | double | absorption coefficient of black carbon at the Blue wavelength channel |
ir_babs | double | absorption coefficient of black carbon at the IR (infrared) wavelength channel |
date_time | double | date and time of monitoring |
day_type | character | type of day of monitoring - weekend or weekday |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | end date of the experiment |
start_time | double | starting time of the experiment |
end_time | double | end time of the experiment |
exp_type | character | type of experiment |
settlement_id | character | settlement name at which the monitoring was conducted |
time_of_day | character | the days are divided into three types - morning, first half and second half |
type_of_settlement | character | type of settlement - formal or informal |
id_road_type | double | a unique id given to different road sections |
type_of_road | character | roads are categorised as highways and non-highways |
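The road classification makes it possible to contrast highways with other roads inside each settlement. As a minimal sketch, the code below computes the ratio of median concentrations on main roads to those on other roads per settlement. The data are synthetic, the settlement names are hypothetical, and the road labels (main_road, non_main_roads) are assumed from the dataset overview.

```r
# Synthetic stand-in for df_mm_road_type; values and settlement names are made up.
toy_roads <- data.frame(
  settlement_id = rep(c("S1", "S2"), each = 4),
  type_of_road = rep(
    c("main_road", "main_road", "non_main_roads", "non_main_roads"),
    times = 2
  ),
  ir_bcc = c(8000, 6000, 2000, 4000, 5000, 7000, 3000, 3000)
)

# Median concentration in a settlement-by-road-type matrix.
med <- tapply(
  toy_roads$ir_bcc,
  list(toy_roads$settlement_id, toy_roads$type_of_road),
  median
)

# Ratio of main-road to non-main-road medians, one value per settlement.
road_ratio <- med[, "main_road"] / med[, "non_main_roads"]
road_ratio
```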
df_pm
Personal monitoring data in four unplanned settlements in Blantyre, Malawi, covering the areas inaccessible by vehicles due to narrow and undefined unpaved roads.
For personal mobile monitoring, an individual carried the monitoring equipment and conducted on-foot surveys within the informal settlements. The micro-cyclone was attached at the collar of the person.
This method was implemented twice in each settlement, from 9:00 a.m. to 11:30 a.m. and from 2:00 p.m. to 4:30 p.m., between 19th and 25th May, 2023, covering weekdays.
The df_pm data set has 23 variables and 1108 observations. For an overview of the variable names, see the following table.
df_pm |>
head() |>
gt::gt() |>
gt::as_raw_html()
serial_number | session_id | date | time | lat | long | uv_bcc | blue_bcc | ir_bcc | uv_babs | blue_babs | ir_babs | date_time | day_type | id | date_start | date_end | start_time | end_time | exp_type | settlement_id | time_of_day | type_of_settlement |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time of monitoring |
lat | double | latitude of location of monitoring |
long | double | longitude of location of monitoring |
uv_bcc | double | concentration of black carbon at the UV (ultraviolet) wavelength channel in ng/m3 |
blue_bcc | double | concentration of black carbon at the Blue wavelength channel in ng/m3 |
ir_bcc | double | concentration of black carbon at the IR (infrared) wavelength channel in ng/m3 |
uv_babs | double | absorption coefficient of black carbon at the UV (ultraviolet) wavelength channel |
blue_babs | double | absorption coefficient of black carbon at the Blue wavelength channel |
ir_babs | double | absorption coefficient of black carbon at the IR (infrared) wavelength channel |
date_time | double | date and time of monitoring |
day_type | character | type of day of monitoring - weekend or weekday |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | end date of the experiment |
start_time | double | starting time of the experiment |
end_time | double | end time of the experiment |
exp_type | character | type of experiment |
settlement_id | character | settlement name at which the monitoring was conducted |
time_of_day | character | the days are divided into three types - morning, first half and second half |
type_of_settlement | character | type of settlement - formal or informal |
df_pm_trips
While conducting the personal monitoring, the individual also recorded the times when open waste burning was observed. These observation times are given in this dataset.
The df_pm_trips data set has 11 variables and 81 observations. For an overview of the variable names, see the following table.
df_pm_trips |>
head() |>
gt::gt() |>
gt::as_raw_html()
id | serial_number | session_id | date_start | date_end | start_time | end_time | exp_type | event | time | settlement_id |
---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time when a burning event was observed |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | ending date of the experiment |
start_time | double | starting time of the experiment |
end_time | double | ending time of the experiment |
exp_type | character | type of experiment |
settlement_id | character | settlement name at which the monitoring was conducted |
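The burning observations can be related to the personal monitoring measurements by flagging readings taken close to an observed event. The sketch below is one way to do this under stated assumptions: both data frames are synthetic, `event_time` is a hypothetical column that would first have to be built from the date and time columns of df_pm_trips, and the two-minute window is an arbitrary choice.

```r
# Synthetic one-minute readings standing in for df_pm (values are made up).
toy_pm <- data.frame(
  date_time = as.POSIXct("2023-05-19 09:00:00", tz = "UTC") + 60 * 0:9,
  ir_bcc = c(900, 950, 1000, 2500, 4000, 3800, 2000, 1100, 1000, 950)
)

# Synthetic burning observation standing in for df_pm_trips.
# event_time is a hypothetical combined date-time column.
toy_trips <- data.frame(
  event_time = as.POSIXct("2023-05-19 09:04:30", tz = "UTC")
)

# Arbitrary window: readings within 120 seconds of any event are flagged.
window_secs <- 120
obs_secs <- as.numeric(toy_pm$date_time)
event_secs <- as.numeric(toy_trips$event_time)

toy_pm$near_burning <- vapply(
  obs_secs,
  function(s) any(abs(s - event_secs) <= window_secs),
  logical(1)
)

# Mean concentration for flagged and unflagged readings.
tapply(toy_pm$ir_bcc, toy_pm$near_burning, mean)
```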
df_sm
To capture the ambient concentration and diurnal pattern of LAC, stationary monitoring was conducted in two specific areas, one planned and one unplanned. The monitoring campaign spanned from 13th July to 22nd August during the winter season.
The df_sm data set has 21 variables and 20756 observations. For an overview of the variable names, see the following table.
df_sm |>
head() |>
gt::gt() |>
gt::as_raw_html()
serial_number | session_id | date | time | lat | long | uv_bcc | blue_bcc | ir_bcc | uv_babs | blue_babs | ir_babs | date_time | day_type | id | date_start | date_end | start_time | end_time | exp_type | settlement_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time of monitoring |
lat | double | latitude of location of monitoring |
long | double | longitude of location of monitoring |
uv_bcc | double | concentration of black carbon at the UV (ultraviolet) wavelength channel in ng/m3 |
blue_bcc | double | concentration of black carbon at the Blue wavelength channel in ng/m3 |
ir_bcc | double | concentration of black carbon at the IR (infrared) wavelength channel in ng/m3 |
uv_babs | double | absorption coefficient of black carbon at the UV (ultraviolet) wavelength channel |
blue_babs | double | absorption coefficient of black carbon at the Blue wavelength channel |
ir_babs | double | absorption coefficient of black carbon at the IR (infrared) wavelength channel |
date_time | double | date and time of monitoring |
day_type | character | type of day of monitoring - weekend or weekday |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | end date of the experiment |
start_time | double | starting time of the experiment |
end_time | double | end time of the experiment |
exp_type | character | type of experiment |
settlement_id | character | settlement name at which the monitoring was conducted |
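Since the stationary data are intended to capture diurnal patterns, a natural summary is the mean concentration per hour of day. The sketch below shows this on a synthetic stand-in for df_sm: the dates and values are made up, with artificial peaks placed in the morning and evening purely for illustration.

```r
# Synthetic hourly series covering two days (dates and values are made up).
toy_sm <- data.frame(
  date_time = as.POSIXct("2000-01-01 00:00:00", tz = "UTC") + 3600 * 0:47
)

# Hour of day extracted from the timestamp.
toy_sm$hour <- as.integer(format(toy_sm$date_time, "%H"))

# Artificial baseline of 1000 ng/m3 with peaks at 06-07 h and 18-19 h.
toy_sm$ir_bcc <- 1000 + 500 * (toy_sm$hour %in% c(6, 7, 18, 19))

# Mean concentration for each hour of day across all days.
diurnal <- aggregate(ir_bcc ~ hour, data = toy_sm, FUN = mean)
head(diurnal)
```

With the real data, the same aggregation could be run separately per settlement_id to compare the two monitoring sites.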
df_collocation
For quality assurance, the two sensors were collocated multiple times throughout the study: twice during mobile monitoring, twice during AAE experiments, and before and after stationary monitoring.
The df_collocation data set has 21 variables and 6037 observations. For an overview of the variable names, see the following table.
df_collocation |>
head() |>
gt::gt() |>
gt::as_raw_html()
serial_number | session_id | date | time | lat | long | uv_bcc | blue_bcc | ir_bcc | uv_babs | blue_babs | ir_babs | date_time | day_type | id | date_start | date_end | start_time | end_time | exp_type | main_exp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
variable_name | variable_type | description |
---|---|---|
serial_number | character | serial number of the MA200 monitor from which the data is collected |
session_id | double | session number of the MA200 monitor (each monitoring session is automatically given a number in the output file of MA200 monitoring data) |
date | double | date of monitoring |
time | double | time of monitoring |
lat | double | latitude of location of monitoring |
long | double | longitude of location of monitoring |
uv_bcc | double | concentration of black carbon at the UV (ultraviolet) wavelength channel in ng/m3 |
blue_bcc | double | concentration of black carbon at the Blue wavelength channel in ng/m3 |
ir_bcc | double | concentration of black carbon at the IR (infrared) wavelength channel in ng/m3 |
uv_babs | double | absorption coefficient of black carbon at the UV (ultraviolet) wavelength channel |
blue_babs | double | absorption coefficient of black carbon at the Blue wavelength channel |
ir_babs | double | absorption coefficient of black carbon at the IR (infrared) wavelength channel |
date_time | double | date and time of monitoring |
day_type | character | type of day of monitoring - weekend or weekday |
id | double | unique identifier assigned to each monitoring session and experiment |
date_start | double | starting date of the experiment |
date_end | double | end date of the experiment |
start_time | double | starting time of the experiment |
end_time | double | end time of the experiment |
exp_type | character | type of experiment |
main_exp | character | the experiment phase during which the sensors were collocated |
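One way to check agreement between the two collocated monitors is to pair their readings by timestamp and fit a simple linear regression. The sketch below is an illustration on synthetic data: the serial numbers are hypothetical, the values are made up, and it assumes both monitors log at identical timestamps, which may not hold for the real data without rounding or averaging first.

```r
# Synthetic collocation data for two monitors (serial numbers are hypothetical).
times <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC") + 60 * 0:4
toy_col <- data.frame(
  serial_number = rep(c("MA200-A", "MA200-B"), each = 5),
  date_time = rep(times, times = 2),
  ir_bcc = c(1000, 1500, 2000, 2500, 3000,
             1100, 1600, 2150, 2600, 3200)
)

# Split by monitor and pair the readings on their timestamp.
mon_a <- toy_col[toy_col$serial_number == "MA200-A", c("date_time", "ir_bcc")]
mon_b <- toy_col[toy_col$serial_number == "MA200-B", c("date_time", "ir_bcc")]
paired <- merge(mon_a, mon_b, by = "date_time", suffixes = c("_a", "_b"))

# Slope near 1 and a high R-squared indicate good agreement.
fit <- lm(ir_bcc_b ~ ir_bcc_a, data = paired)
coef(fit)
r_squared <- cor(paired$ir_bcc_a, paired$ir_bcc_b)^2
r_squared
```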
License
Data are available as CC-BY.
Citation
Please cite this package using:
citation("bcsa")
#> To cite package 'bcsa' in publications use:
#>
#> Vijay S, Chilunga H, Khonje L, Kamjombo J, Tilley E, Schöbitz L
#> (2024). "bcsa: What the Package Does (One Line, Title Case)."
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Misc{vijaychilunga,
#> title = {bcsa: What the Package Does (One Line, Title Case)},
#> author = {Saloni Vijay and Hope Kelvin Chilunga and Lennox Khonje and Jack Kajombo and Elizabeth Tilley and Lars Schöbitz},
#> year = {2024},
#> abstract = {The bcsa package provide datasets for source apportionment of light absorbing carbon (LAC) in Blantyre, Malawi. The package contains data on Absorption Angstrom Exponent experiments determination of local pollution sources. The package also contains data on spatial distribution and ambient concentrations of LAC concentrations. This study used the MA200 micro-aethalometer to measure the LAC concentrations. The MA200 measures the LAC concentrations in real-time at five different wavelengths, that allows for source apportionment.},
#> version = {0.0.0.9000},
#> }