Business Intelligence Lab - 2nd module of Business Perfomance Analysis (Analisi delle Prestazioni Aziendali)
Graduate Program in Business Informatics, Academic Year 2016-2017, first semester
New web page starting from AY 2017/18.
- Lessons will be held at: Polo Didattico "L. Fibonacci", Via F. Buonarroti 4, Pisa.
- Timetable:
- Tuesday, Room M-Lab, hours: 16:00-18:00
- Thursday, Room H-Lab, hours: 9:00-11:00
- Teachers:
- Salvatore Ruggieri. Office hours: Tuesday 14-16, Room 321/DO, Dept. of Computer Science. Telephone +39-050-2212782.
- Anna Monreale. Office hours: Monday 14-16, Room 374/DO, Dept. of Computer Science. Telephone +39-050-2213119.
- Preliminary program and
calendar of lessons.
- Notice: the 1st and the 2nd module of APA must be passed within the same academic year (by April 2017 for students attending in AY 2015/16, by April 2018 for students attending in AY 2016/17).
- Software used:
- Eclipse IDE for Java developers and Java
- SQL Server 2016 Developer Edition:
- Mandatory: SQL Server 2016 Management Studio and SQL Server 2016 Data Tools (no need to install Microsoft Visual Studio 2015! choose option 4 "Download SSDT as an ISO image")
- Optional (not recommended on laptops): SQL Server 2016 Developer Edition can be downloaded from Microsoft or can be downloaded from MSDN-AA. During installation, set the following options as a minimum.
- Microsoft Excel
- Anaconda (choose Python 3.5 version).
-
Connection to wi-fi,
and F.A.Q.s about the labs.
- Audio-video records of lessons can be found here.
- Mid-term written tests (registration is mandatory)
- First mid-term results.
- Previous years: 2016/17 text, 2015/16 text and 2015/16 solution, 2014/15 text and 2014/2015 solution, 2013/14 text, 2012/13 text and 2012/13 solution.
- Second mid-term: 22 December h. 9-13 room M-Lab register here
- Previous years: 2015/16 text, 2014/15 text, 2013/14 text, 2012/13 text and 2012/13 solution.
- First mid-term results.
- Regular exam sessions: the exam consists of a written test and an oral test. Dates of written tests (registration is mandatory):
- 5th July 2017, hours 9-13, Room H-Lab register here.
- 6th September 2017, hours 9-13, Room H-Lab register here.
Tweet
Lessons and teaching material
The teaching material is freely accessible from the unipi.it domain. Outside it is password-protected (ask the teacher for credentials).
Notice for Windows on the PC at labs: save all your files (Eclipse workspace, Visual Studio solutions) to your home (disk E:), not on Desktop or in MyDocuments because these directories may be cleaned from time to time.
- Tuesday 20 September 2016, M-Lab, 16-18 (A. Monreale)
Introduction. File data access. Representation formats: CSV, FLV, ARFF, XML.
Slides: Introduction, BI architectures. File Data Access.
Sample data: data1.zip.
References:- For an introduction to the technology of BI:
- S. Chaudhuri, U. Dayal, V. Narasayya. An Overview of Busines Intelligence Technology, Communications of the ACM, August 2011.
- For file access:
- A. Silberschatz, P. Baer Galvin, G. Gagne. Operating System Concepts, 9th ed., 2013. Chp. 11: File System Interface.
- For file representation formats:
- P. Murrell. Introduction to data technologies, 2013. Chps. 5, 6.
- Weka ARFF format, XRFF format.
- For an introduction to the technology of BI:
- Thursday 22 September 2016, H-Lab 9-11
NO LESSON ON THIS DATE DUE TO LAB UNAVAILABILITY. THIS DATE WILL BE RECOVERED ON TUESDAY 27 SEPTEMBER, M-Lab 11-13.
- Tuesday 27 September 2016, M-Lab, 11-13 (S. Ruggieri)
Java and Eclipse recap. Lab practice: maximal subsequence.
Slides: Java recap.
References:- For basic Java programming, the following is an option (but many other good books are OK):
- C. S. Horstmann, G. Cornell. Core Java Vol. 1: Fundamentals. Prentice-Hall. 8th or 9th edition. Chpts. 1-6,11-13
- For text/binary/xml file data access in Java:
- C. S. Horstmann, G. Cornell. Core Java Vol. 2: Advanced Features. Prentice-Hall. 8th or 9th edition. Chpts. 1,2
- For basic usage of the Eclipse IDE
- See the Help contents from the Help menu
- For basic Java programming, the following is an option (but many other good books are OK):
- Tuesday 27 September 2016, M-Lab, 16-18 (S. Ruggieri)
OOP in Java. Lab practice: relational algebra.
- Thursday 29 September 2016, H-Lab 9-11 (S. Ruggieri)
File data access in Java. Lab practice: CSV2ARFF file format conversion.
Software: lbi.utils.zip. - Tuesday 4 October 2016, M-Lab, 16-18 (S. Ruggieri)
Lab practice: XML2CSV/CSV2JSON file format conversion.
Software: lbi.relational.zip and lbi.file.format.zip
- Thursday 6 October 2016, H-Lab 9-11 (S. Ruggieri)
RDBMS access protocols: ODBC, OLE DB, JDBC. JDBC Programming.
Slides: RDBMS access.
Software: lbi.jdbc.zip
References:- For JDBC programming in Java:
- C. S. Horstmann, G. Cornell. Core Java Vol. 2: Advanced Features. Prentice-Hall. 8th or 9th edition. Chpt. 4
- For JDBC programming in Java:
- Tuesday 11 October 2016, M-Lab, 16-18 (S. Ruggieri)
Lab practice: stratified sampling in JDBC.
Software: lbi.jdbc.sampling.zip - Thursday 13 October 2016, H-Lab 9-11
Introduction to SQL Server. ETL tools: SQL Server Integration Services (SSIS).
Slides: SQL Server, SSIS.
Database backups: sample SQL 2016 DBs.
References:- For SQL Server:
- Technical docs and tutorials.
- Management Studio: documentation and tutorial. Notice: By default, SQL Server Management Studio poses some restrictions to the modification of table schema, returning an error after modifications. To remove such restrictions de-select Tools->Options->Designers->"Prevent Saving changes that require table re-creation".
- SQL Server Data Tools: documentation
- For SSIS:
- SSIS: documentation and tutorial
- B. Knight et al. Professional SQL Server 2014 Integration Services, Wrox publisher, 2014. Chps. 1-7.
- For SQL Server:
- Tuesday 18 October 2016, M-Lab, 16-18 (S. Ruggieri)
SSIS samples and lab practice: pipeline, sampling.
- Thursday 20 October 2016, H-Lab 9-11 (S. Ruggieri)
SSIS samples and lab practice: surrogate keys, slowly changing dimensions.
- Tuesday 25 October 2016, M-Lab, 16-18 (S. Ruggieri)
SSIS samples and lab practice: slowly changing dimensions and running total. Change data capture.
Software: VS solution with all SSIS examples: 2016SSIS. Notice: an error is normally raised when you click on a package due to encrypted password: re-enter your login and password in each connection manager and then press the refresh button. - Thursday 27 October 2016, H-Lab 9-11 (S. Ruggieri)
Datawarehousing and OLAP recap. Data cubes, analytic SQL, and materialized views in SQL Server.
Slides: DW and OLAP.
Software: sql query examples.
References:- For DW and OLAP background:
- Decision support databases course lecture notes.
- For DW and OLAP background:
- Tuesday 8 November 2016, M-Lab, 16-18 (S. Ruggieri)
OLAP with SQL Server Analysis Services (SSAS): data source views, dimensions, hierarchies. Data cubes.
Slides: SSAS.
References:- For SSAS:
- SSAS (olap): documentation.
- S. Harinath et al. Professional Microsoft SQL Server Analysis Services 2012 with MDX and DAX, Wrox publisher, 2012. Chps. 4-6.
- For SSAS:
- Thursday 10 November 2016, H-Lab 9-11 (S. Ruggieri)
OLAP explorative data analysis with Pivot Tables in Excel. Calculated metrics. Parent-child hierarchies.
Software: FoodmartExplorative.xlsx.
Software: solution SSDT with SSAS project developed so far.
Notice: to avoid conflicts in deployment/process follow this steps once the solution is opened: (1) rename the project as <your account>_foodmart; (2) from project properties select 'Deployment', then rename the database as <your account>_foodmart; (3) click on the button "show all files" just above "Solution explorer" right click on "view code" on the .database file that is visualized, and then change the ID from ruggieri_foodmart into <your account>_foodmart, and finally save the file; (4) change the credentials of connection to database on SQL Server. As an alternative solution you may import the project from the SSAS server and rename it as <your account>_foodmart (step 4 is still necessary).
References:- For Pivot Tables in Excel:
- G. Harvey. Excel 2013 All-in-One For Dummies, 2013. Chp. VII-2.
- For Pivot Tables in Excel:
- Tuesday 15 November 2016, M-Lab, 16-18 (S. Ruggieri)
ROLAP and MOLAP in SSAS. MDX queries.
Software: MDXSample.mdx.
References:- For MDX:
- MDX: documentation and a useful guide on ordering.
- S. Harinath ed altri. Professional Microsoft SQL Server Analysis Services 2012 with MDX and DAX, Wrox publisher, 2012. Chp. 3.
- For MDX:
- Thursday 17 November 2016, H-Lab 9-11 (S. Ruggieri)
Practice with SSAS and MDX.
Software: MDXPractice.mdx - Tuesday 22 November 2016, M-Lab, 16-18 (S. Ruggieri)
Practice with SSAS and MDX. Reporting with SQL Server Reporting Services (SSRS).
Slides: SSRS.
Software: SSRS examples.
Notice: to avoid conflicts in deployment follow this steps once the solution is opened: (1) rename the project as <your account>_report; (2) from project properties select 'General', then rename the TargetDataSourceFolder as <your account>_datasources and the TargetReportFolder as <your account>_report.
References:- For SSRS:
- SSRS: documentation and tutorial
- For SSRS:
- Thursday 24 November 2016, 9-11 Room "Sala Gerace", Dept. of Computer Science
Seminar (in Italian): Progettazione e tool utilizzati per la realizzazione di sistemi di BI nel settore pubblico e privato. Nicola Matarese (Tai, www.tai.it)
- Tuesday 29 November 2016, M-Lab, 16-18 (A. Monreale)
Microsoft Azure Platform & Azure Machine Learning Studio: Classification in Azure ML.
Slides: AzurePlatform-PaaS, Azure-ML-Studio.
Software: Azure-ML-code. - Wednesday 30 November 2016, 16-18 Room "Sala Gerace", Dept. of Computer Science
Seminar (in English): How Big Data and Advanced Analytics are shaping tomorrow's business models. Alessio Botta and Elisa Piscitelli (McKinsey, www.mckinsey.it)
- Thursday 1 December 2016, H-Lab 9-11 (A. Monreale)
Practice on classification using AzureML. Solution of practice on classification.
Slides: Practice.
Software: practice data and practice code. - Tuesday 6 December 2016, M-Lab, 16-18 (A. Monreale)
Association Rules using AzureML & Python.
Slides: PracticeSolution, AssociationRules.
Software: practice data and practice code. - Tuesday 13 December 2016, M-Lab, 16-18 (A. Monreale)
Deployment of Data Mining Model on Azure.
Slides: Azure-ML-API.
Software: practice data and practice code. - Thursday 15 December 2016, H-Lab 9-11 (A. Monreale)
Clustering using AzureML & Python.
Slides: Clustering.
Software: practice data.