Importing the ECHE list
This application imports the ECHE list from an .xlsx file, as published by DGEAC.
Source file
The source file is taken from the Erasmus+ website, then added manually to the code base.
Importing from .xlsx
With the help of the openpyxl library, the .xlsx file is loaded as a data-only workbook: formula cells yield the results stored in the file rather than the formulas themselves.
The first worksheet in the workbook is considered to be the active one.
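As a minimal sketch of the loading step, assuming openpyxl is installed: the helper name load_first_worksheet is hypothetical and not taken from the code base. The data_only=True flag is what makes openpyxl return stored results instead of formulas.

```python
from openpyxl import load_workbook


def load_first_worksheet(source):
    """Return the first worksheet of an .xlsx file, read as data values.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper, for illustration only.)
    """
    # data_only=True: formula cells expose the result stored in the file,
    # not the formula text.
    workbook = load_workbook(source, data_only=True)
    # The first worksheet is treated as the active one.
    return workbook.worksheets[0]
```

Note that openpyxl does not evaluate formulas itself; it only reads the results that the spreadsheet application saved alongside them.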
Headers and values
The first row of the active worksheet is used to retrieve the column headers. These headers should match the expected values as seen in the list below, so that they can be correctly mapped to the API keys.
- Proposal Number (proposalNumber)
- Erasmus Code (erasmusCode)
- PIC (pic)
- OID (oid)
- Organisation Legal Name (organisationLegalName)
- Street (street)
- Postal Code (postalCode)
- City (city)
- Country (country)
- Webpage (webpage)
- ECHE Start Date (echeStartDate)
- ECHE End Date (echeEndDate)
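The mapping in this list can be written down as a dictionary, which also makes it easy to check a header row before going further. This is only a sketch: the names HEADER_TO_API_KEY and missing_headers are hypothetical, and the exact-match comparison is an assumption about how strictly headers are compared.

```python
# Expected column headers mapped to their API keys (taken from the list above).
HEADER_TO_API_KEY = {
    "Proposal Number": "proposalNumber",
    "Erasmus Code": "erasmusCode",
    "PIC": "pic",
    "OID": "oid",
    "Organisation Legal Name": "organisationLegalName",
    "Street": "street",
    "Postal Code": "postalCode",
    "City": "city",
    "Country": "country",
    "Webpage": "webpage",
    "ECHE Start Date": "echeStartDate",
    "ECHE End Date": "echeEndDate",
}


def missing_headers(headers):
    """Return the expected headers that are absent from `headers`.

    Assumption: headers must match exactly; empty cells (None) are ignored.
    """
    found = {header for header in headers if header is not None}
    return [expected for expected in HEADER_TO_API_KEY if expected not in found]
```

An empty result means every expected header was found, so each column can be mapped to its API key.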
All remaining rows are taken as the data values.
With the help of the pandas library, the headers and data values are used to produce a DataFrame. All empty rows and columns are dropped.
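A sketch of this step, assuming the worksheet rows are available as plain tuples (for example from openpyxl's ws.iter_rows(values_only=True)). The function name rows_to_dataframe is hypothetical.

```python
import pandas as pd


def rows_to_dataframe(rows):
    """Build a DataFrame from worksheet rows; the first row holds the headers.

    (Hypothetical helper, for illustration only.)
    """
    rows = list(rows)
    headers, data = rows[0], rows[1:]
    df = pd.DataFrame(data, columns=list(headers))
    # Drop rows in which every cell is empty.
    df = df.dropna(axis="index", how="all")
    # Drop columns in which every cell is empty.
    df = df.dropna(axis="columns", how="all")
    return df
```

Using how="all" keeps rows and columns that are only partially filled; only entirely empty ones are removed.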
Cleaning whitespace
The new DataFrame undergoes some basic data cleaning: leading and trailing whitespace and line-break characters are removed from all strings. Additionally, certain known strings (i.e. errors from the .xlsx formulas) are replaced with empty strings.
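The cleaning could look roughly like the following sketch. The set ERROR_STRINGS is an assumption: it lists a few standard spreadsheet error values, while the strings actually replaced by the application are not given here. The function names are hypothetical as well.

```python
import pandas as pd

# Assumed examples of formula error strings; the real list may differ.
ERROR_STRINGS = {"#N/A", "#REF!", "#VALUE!"}


def clean_value(value):
    """Strip strings and blank out known error strings; leave other types alone."""
    if not isinstance(value, str):
        return value
    # str.strip() removes leading and trailing whitespace, including line breaks.
    stripped = value.strip()
    return "" if stripped in ERROR_STRINGS else stripped


def clean_dataframe(df):
    """Apply clean_value to every cell, column by column."""
    return df.apply(lambda column: column.map(clean_value))
```

Non-string cells such as numbers and dates pass through unchanged, so the cleaning only affects text.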
Renaming columns
Once the DataFrame is clean, the columns are renamed according to the list shown above. From that point on, all columns are named after the API keys.
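With pandas, the renaming can be expressed as a single call to DataFrame.rename. The sketch below uses a hypothetical helper, rename_to_api_keys, which takes the header-to-key mapping as an argument.

```python
import pandas as pd


def rename_to_api_keys(df, header_to_api_key):
    """Return a copy of `df` whose columns are renamed to the API keys.

    Columns that are not in the mapping keep their original name.
    (Hypothetical helper, for illustration only.)
    """
    return df.rename(columns=header_to_api_key)


# Usage with an excerpt of the mapping:
example = pd.DataFrame({"Erasmus Code": ["X"], "ECHE Start Date": ["2021-01-01"]})
renamed = rename_to_api_keys(
    example,
    {"Erasmus Code": "erasmusCode", "ECHE Start Date": "echeStartDate"},
)
```

Because rename returns a new DataFrame by default, the cleaned DataFrame itself is left untouched.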