Class 4 - Hybrid LCA#
In this class, we will learn about supply and use tables and input-output tables. We will also do a toy hybrid LCA.
Before getting started, make sure you have upgraded the Brightway2 packages. You should have at least the following:
import bw2data, bw2calc, bw2io
print("BW2 data:", bw2data.__version__)
print("BW2 calc:", bw2calc.__version__)
print("BW2 io:", bw2io.__version__)
Now import the necessary libraries:
from brightway2 import *
from bw2io.importers.exiobase import Exiobase22Importer
import numpy as np
import os
import pyprind
Create a new project for this class:
if 'Class 4' not in projects:
    projects.current = "Class 1"
    projects.copy_project("Class 4")
projects.current = "Class 4"
We will need the latest version of the data migrations to match EXIOBASE biosphere flows to ecoinvent biosphere flows:
create_core_migrations()
ERROR_MSG = """Missing a data migration needed for this class.
Please make sure you have the latest Brightway2 libraries, and reset the notebook."""
assert 'exiobase-biosphere' in migrations, ERROR_MSG
Import EXIOBASE 2.2#
Now we need to download the industry by industry table from version 2.2 of exiobase. You can get it from the following link. Note that you will have to register an account if this is the first time you use this database: http://www.exiobase.eu/index.php/data-download/exiobase2-year-2007-full-data-set/78-mriot-ixi-fpa-coefficient-version2-2-2/file
Extract the downloaded file, and adjust the following. Windows users might need something like:
fp = "C:\\Users\\<your name>\\Downloads\\mrIOT_IxI_fpa_coefficient_version2.2.2"
fp = "/Users/cmutel/Downloads/mrIOT_IxI_fpa_coefficient_version2.2.2"
assert os.path.exists(fp), "Please adjust your filepath, the provided one doesn't work"
We can now import the exiobase database. This will take a while, so go ahead and get started.
Why is this so slow compared to ecoinvent, for example? The answer lies in the density of the technosphere matrix. Exiobase, and IO tables in general, use comprehensive data from surveys and national customs statistics, so they include data on things that most people would never even think of. For example, how much rice from Thailand is required to produce one euro of steel in Germany?
In other words, the technosphere matrix is very dense. Ecoinvent is stored as a sparse matrix, where data is only provided in about 1.5% of all possible locations - every other value is zero, and these zeros are not stored, only implied. The IO table, however, has a fill rate of about 50%, meaning that we effectively store every value in the matrix. The technosphere matrix in ecoinvent 2.2 is about 4000 by 4000, but we only need to store about 40,000 numbers. The technosphere matrix in exiobase is about 8000 by 8000, but we store around 35,000,000 numbers.
We use a special backend for IO databases, as our standard storage mechanisms simply fall apart with such large data sets. You can see this backend here.
ex = Exiobase22Importer(fp)
ex.apply_strategies()
ex.write_database()
Creating database EXIOBASE 2.2
Extracting metadata
Extracting emissions
Extracting resources
Extracting main IO table
Extracted 163 datasets and many exchanges in 1.41 seconds
Aggregating `substances` and `extractions`
Processing technosphere
Processing exchanges
Writing activities to SQLite3 database:
0% 100%
[##############################] | ETA[sec]: 0.000
Total time elapsed: 1.234 sec
Title: Writing activities to SQLite3 database:
Started: 11/04/2015 05:42:54
Finished: 11/04/2015 05:42:56
Total time elapsed: 1.234 sec
CPU %: 89.000000
Memory %: 1.973295
Starting IO table write
Writing geomapping
Creating array - this will take a while...
On exchange number 1000000
On exchange number 2000000
...
On exchange number 35000000
Trimming array
Writing array
Free up some memory
ex = None
LCA calculations#
We can now do an LCA. We first do this the standard way:
gwp = ('IPCC 2013', 'climate change', 'GWP 100a')
lca = LCA({Database("EXIOBASE 2.2").random(): 1}, method=gwp)
lca.lci()
lca.lcia()
Our technosphere matrix is stored as a SciPy sparse matrix, even though most of its values are non-zero:
lca.technosphere_matrix
<7824x7824 sparse matrix of type '<class 'numpy.float64'>'
with 35550884 stored elements in Compressed Sparse Row format>
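You can compute the fill rate directly from this matrix (a quick check using only the stored element count and the matrix shape):
print("Fill rate: {:.1%}".format(lca.technosphere_matrix.nnz / np.prod(lca.technosphere_matrix.shape)))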
And it takes a while to solve (versus less than one second for ecoinvent 2.2):
%timeit lca.solve_linear_system()
1 loops, best of 3: 13.6 s per loop
Free up some memory by forgetting about the lca object.
lca = None
However, we have a special DenseLCA class that treats the technosphere as a dense matrix. If we use it, we will get better performance, because the linear solver assumes a dense instead of a sparse matrix:
dlca = DenseLCA({Database("EXIOBASE 2.2").random(): 1}, method=gwp)
dlca.lci()
The technosphere matrix object is still stored in a sparse format - the difference is that DenseLCA converts it to a dense array when solving the linear system:
type(dlca.technosphere_matrix)
scipy.sparse.csr.csr_matrix
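Roughly speaking, the only difference is in how the linear system is solved. Here is a sketch of the two strategies - not the exact bw2calc implementation, and it assumes the demand_array attribute that lci() builds:
from scipy.sparse.linalg import spsolve

# Sparse solve, as in the standard LCA class
supply_sparse = spsolve(dlca.technosphere_matrix, dlca.demand_array)
# Dense solve: convert to a dense array first, then use the numpy solver
supply_dense = np.linalg.solve(dlca.technosphere_matrix.toarray(), dlca.demand_array)
# The two supply vectors should agree to within numerical precision
print(np.abs(supply_sparse - supply_dense).max())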
The numpy dense linear system solver is faster than the SciPy/UMFPACK sparse solver, as our matrix actually is quite dense. The performance should be much better:
%timeit dlca.solve_linear_system()
1 loops, best of 3: 7.58 s per loop
Free up some more memory by forgetting about the tech_params array.
print(dlca.tech_params.shape)
dlca.tech_params = None
(35557095,)
Create aggregated processes#
We can now create aggregated (so-called “system”) processes for each activity in Exiobase. These aggregated processes can be used in our normal sparse LCAs, but they are terminated, i.e. we can no longer examine their background supply chains.
First, we create a new database.
aggregated_db = Database("EXIOBASE 2.2 aggregated")
This is a normal database, not an IOTable database.
type(aggregated_db)
bw2data.backends.peewee.database.SQLiteBackend
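For comparison, the full EXIOBASE database should use the special IO table backend mentioned earlier - you can check it the same way:
type(Database("EXIOBASE 2.2"))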
Now, we invert the EXIOBASE technosphere matrix.
This takes some minutes - around 4 on my laptop - so just be patient. It is helpful if there is plenty of free memory.
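As a rough estimate of the memory involved (assuming 64-bit floats):
print("{:.2f} GB per dense copy".format(7824 ** 2 * 8 / 1024 ** 3))  # pinv needs several arrays of this size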
inverse = np.linalg.pinv(dlca.technosphere_matrix.todense())
With the inverse, we can calculate the aggregated inventories - each column of biosphere_matrix · inverse gives the cumulative biosphere flows per million € of output of one sector - and then write each aggregated process.
inventory = dlca.biosphere_matrix * inverse
print(inventory.shape)
(36, 7824)
Define the activity data fields we want to keep
KEYS = (
    'exiobase_code',
    'group',
    'group_name',
    'location',
    'name',
    'synonym',
    'type',
    'unit'
)
data = {}
Take only the non-zero biosphere flows, and create the aggregated processes.
for ds in pyprind.prog_bar(Database("EXIOBASE 2.2")):
    col = dlca.activity_dict[ds.key]
    # Basic data
    data[("EXIOBASE 2.2 aggregated", ds['code'])] = {key: ds[key] for key in KEYS}
    # Exchanges
    data[("EXIOBASE 2.2 aggregated", ds['code'])]['exchanges'] = [{
        'type': 'biosphere',
        'amount': float(inventory[row, col]),
        'input': flow,
        'uncertainty type': 0
    } for flow, row in dlca.biosphere_dict.items() if inventory[row, col]]
0% 100%
[##############################] | ETA[sec]: 0.000
Total time elapsed: 2.755 sec
aggregated_db.write(data)
Writing activities to SQLite3 database:
0% 100%
[##############################] | ETA[sec]: 0.000
Total time elapsed: 33.328 sec
Title: Writing activities to SQLite3 database:
Started: 11/04/2015 06:13:16
Finished: 11/04/2015 06:13:49
Total time elapsed: 33.328 sec
CPU %: 84.800000
Memory %: 11.791039
We no longer need the dlca object, so we can forget about it to save some memory.
dlca = None
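As an optional sanity check (a sketch - the full EXIOBASE solve takes a little while), an aggregated process should reproduce essentially the same LCIA score as a calculation against the full EXIOBASE database:
act = Database("EXIOBASE 2.2").random()

full = LCA({act: 1}, method=gwp)
full.lci()
full.lcia()

agg = LCA({("EXIOBASE 2.2 aggregated", act['code']): 1}, method=gwp)
agg.lci()
agg.lcia()

print(full.score, agg.score)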
Sample LCA calculations#
We will look at two product systems selected in class. We found the dataset keys using code like:
for x in Database("ecoinvent 2.2").search('fertili*'):
    print(x, x.key)
Cement production#
ex_cement = ('EXIOBASE 2.2 aggregated', 'Manufacture of cement, lime and plaster:CH')
ei_cement = ('ecoinvent 2.2', 'c2ff6ffd532415eda3eaf957b17b70a1')
Check to make sure we have the correct activities
get_activity(ex_cement)
'Manufacture of cement, lime and plaster' (million €, Switzerland, None)
get_activity(ei_cement)
'cement, unspecified, at plant' (kilogram, CH, ['construction materials', 'binder'])
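The two scores are not directly comparable: the EXIOBASE score is per million euros of sector output, while the ecoinvent score is per kilogram of cement. The conversion below assumes a price of 100 euros per tonne:
price = 100  # assumed euros per tonne
kg_per_million_euro = 1e6 / price * 1000
print(kg_per_million_euro)  # 1e7, which is why the code below divides by 1e6 and then by 10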
lca = LCA({ex_cement: 1}, gwp)
lca.lci()
lca.lcia()
print("Exiobase:", lca.score / 1e6 / 10) # Assume 100 euros/ton
lca = LCA({ei_cement: 1}, gwp)
lca.lci()
lca.lcia()
print("Ecoinvent", lca.score)
Exiobase: 0.6778455551516355
Ecoinvent 0.7391718051829614
Warning: (almost) singular matrix! (estimated cond. number: 1.97e+12)
These numbers are remarkably similar.
Nitrogenous fertilizer#
Let’s now look at nitrogen fertilizer:
ei_n = ('ecoinvent 2.2', '920a20d9a87340557a31ee7e8a353d3c')
ex_n = ('EXIOBASE 2.2 aggregated', 'N-fertiliser:LU')
Check to make sure we have the correct activities
get_activity(ei_n)
'potassium nitrate, as N, at regional storehouse' (kilogram, RER, ['agricultural means of production', 'mineral fertiliser'])
get_activity(ex_n)
'N-fertiliser' (million €, Luxembourg, None)
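The same price-based conversion is needed here, this time assuming 800 euros per tonne: one million euros corresponds to 1,250 tonnes, i.e. 1.25 million kg, and dividing by 1.25e6 is the same as multiplying by 0.8 / 1e6.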
lca = LCA({ex_n: 1}, gwp)
lca.lci()
lca.lcia()
print("Exiobase:", lca.score / 1e6 * 0.8) # Assume 800 euros/ton
lca = LCA({ei_n: 1}, gwp)
lca.lci()
lca.lcia()
print("Ecoinvent:", lca.score)
Exiobase: 0.005093872326115799
Ecoinvent: 15.451009908674084
Warning: (almost) singular matrix! (estimated cond. number: 1.97e+12)
This is quite interesting - more investigation would have to be done to understand why these values are so different.
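One possible starting point (a sketch using only objects we already have; the flows listed will depend on your installation) is to see which characterized biosphere flows dominate the EXIOBASE score:
lca = LCA({ex_n: 1}, gwp)
lca.lci()
lca.lcia()
# Total characterized impact per biosphere flow (row sums of the characterized inventory)
totals = np.array(lca.characterized_inventory.sum(axis=1)).ravel()
top = sorted(lca.biosphere_dict.items(), key=lambda kv: totals[kv[1]], reverse=True)[:5]
for flow, row in top:
    print(get_activity(flow)['name'], totals[row])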
Cleaning up#
This project consumes a lot of hard drive space, about 2 gigabytes. We can get the exact size of this and all other projects (in gigabytes) with the following:
projects.report()
[('3.2', 1, 0.04101514),
('CAES', 3, 0.689499258),
('Class 1', 2, 0.191735801),
('Class 2', 2, 0.191735801),
('Class 3', 3, 0.233215194),
('Class 4', 4, 1.956078295),
('US LCI', 1, 0.024052396),
('backcalculate', 2, 0.68859466),
('databases demo', 1, 2.6909e-05),
('default', 2, 0.390822066),
('econometrics', 0, 0.035075556),
('forwast', 2, 0.053857285),
('lc-impact', 2, 1.689167237),
('parameterized', 1, 0.022645899),
('temporalis', 23, 0.819151974)]
We can then delete the current project.
This step is optional, included as a convenience for those who do not want to work with Exiobase.
projects.delete_project(delete_dir=True)
'default'
The returned value is the name of the new current project - after deleting the current project, Brightway2 switches back to default.
projects.current
'default'