Generate an excel report using python code
Hello Python enthusiasts,
In my new blog I would like to cover a new scenario from our routine work where we have to generate excel reports for our Business from an output csv file.
In my project lot of excel reports are generated using the output file generated from Informatica powercenter tool, where in command task we execute a shell script and give the name of output file as an argument to that shell script which generates the excel report.
And I was just looking through the script and it bounced over me, I hope I am not sounding alien to my readers. I really found writing and understanding shell scripts as cumbersome and tedious. And things get further more complicated when you start getting an error in if-else just because of the underlying shell and you have to search google for the syntax or the compatible shell. I am not saying shell scripts are not good, they definitely are robust but just to make the things more readable and understandable I thought of why not to simplify things. And I performed such scenario at my home as a POC (proof of concept) and as I learnt something new so being over-enthusiast I thought of share my learning with all those who believe in upgrading themselves and simplifying things.
Not wasting further time, let check out the scenario!!!
My source is a flat-file generated from an Informatica mapping and I have to convert it to an excel report. As of now I am using simple data set of department data having few records.
1. Data set as below:
deptno,dname,loc <--header
10,IT,NDelhi
20,HR,Ndelhi
30,Admin,Noida
2. python script as below:
===========================script as below===============================
import xlsxwriter as xls
workbook=xls.Workbook('D:\D_Drive\python\python_programming\excel report generation\dept_report.xlsx')
worksheet=workbook.add_worksheet()
row=0
col=0
data=open(r'D:\D_Drive\python\python_programming\excel report generation\dept.txt')
#below command will give a 1d array of elements,
#to write to excel we need to convert it to 2d array of elements
d=data.readlines()
#print(d)
l2=[]
#below loop wil convert it to 2d array
for v in d:
print(v)
l=list(v.split(','))
l2.append(l)
print(l2)
#l2 is now a 2d array/list
#using below for loop we will iterate over the contents of list and write out stuff to excel document
for dno,dname,loc in l2:
worksheet.write(row,col,dno)
worksheet.write(row,col+1,dname)
worksheet.write(row,col+2,loc)
#worksheet.write(row,col+3,loc_id)
row=row+1
workbook.close()
In my new blog I would like to cover a new scenario from our routine work where we have to generate excel reports for our Business from an output csv file.
In my project lot of excel reports are generated using the output file generated from Informatica powercenter tool, where in command task we execute a shell script and give the name of output file as an argument to that shell script which generates the excel report.
And I was just looking through the script and it bounced over me, I hope I am not sounding alien to my readers. I really found writing and understanding shell scripts as cumbersome and tedious. And things get further more complicated when you start getting an error in if-else just because of the underlying shell and you have to search google for the syntax or the compatible shell. I am not saying shell scripts are not good, they definitely are robust but just to make the things more readable and understandable I thought of why not to simplify things. And I performed such scenario at my home as a POC (proof of concept) and as I learnt something new so being over-enthusiast I thought of share my learning with all those who believe in upgrading themselves and simplifying things.
Not wasting further time, let check out the scenario!!!
My source is a flat-file generated from an Informatica mapping and I have to convert it to an excel report. As of now I am using simple data set of department data having few records.
1. Data set as below:
deptno,dname,loc <--header
10,IT,NDelhi
20,HR,Ndelhi
30,Admin,Noida
2. python script as below:
===========================script as below===============================
import xlsxwriter as xls
workbook=xls.Workbook('D:\D_Drive\python\python_programming\excel report generation\dept_report.xlsx')
worksheet=workbook.add_worksheet()
row=0
col=0
data=open(r'D:\D_Drive\python\python_programming\excel report generation\dept.txt')
#below command will give a 1d array of elements,
#to write to excel we need to convert it to 2d array of elements
d=data.readlines()
#print(d)
l2=[]
#below loop wil convert it to 2d array
for v in d:
print(v)
l=list(v.split(','))
l2.append(l)
print(l2)
#l2 is now a 2d array/list
#using below for loop we will iterate over the contents of list and write out stuff to excel document
for dno,dname,loc in l2:
worksheet.write(row,col,dno)
worksheet.write(row,col+1,dname)
worksheet.write(row,col+2,loc)
#worksheet.write(row,col+3,loc_id)
row=row+1
workbook.close()
===========================script ends=====================================
3. code explanation below:
a. code snippet:
import xlsxwriter as xls
workbook=xls.Workbook('D:\D_Drive\python\python_programming\excel report generation\dept_report.xlsx')
worksheet=workbook.add_worksheet()
explanation as below:
- using a python package xlsxwriter to write the data to output excel file.
- using a method workbook.add_worksheet to add a new sheet in the document.
for v in d:
print(v)
l=list(v.split(','))
#print(l)
l2.append(l)
print(l2)
explanation as below:
- looping over data stream to convert it into a 2d list.
- and appending the items into a list named- l2
for dno,dname,loc in l2:
worksheet.write(row,col,dno)
worksheet.write(row,col+1,dname)
worksheet.write(row,col+2,loc)
#worksheet.write(row,col+3,loc_id)
row=row+1
explanation as below:
- took 3 variables dno,dname,loc and iterating over list l2
- list is a 2d list where every element of list in itself contains three elements.
- list l2 data is as below:
['10', 'IT', 'NDelhi\n'],
['20', 'HR', 'Ndelhi\n'],
['30', 'Admin', 'Noida\n']]
- worksheet.write() method will write the items to excel workbook.
- row, col are the variables to write the data to 2 dimensional structure of excel sheet.
Bingo!!! Finally our desired output is generated.
I personally found the script quiet easy to implement however I am still finding ways to make it more simpler, small and performance oriented, which I feel can be achieved using python script.
Thanks for reading my blog and it would be encouraging if I get some feedback/suggestions on my work.
Thanks Readers
Vivek Chaudhary
Comments
Post a Comment