"What you need in Business is more information than the other guy. Not more smarts. Not more intuition. Just more information".
Welcome to my blog! Enjoy it and leave your mark commenting it!
Saturday, October 4, 2008
BI Education - ExpertExchange Guru Certification
Monday, September 29, 2008
BI Education - SQLBI Methodology
I’m comment Marco Russo and Albert Ferrari draft, only because I want to contribute with my opinion for their project and advertise it. And I would be happy if professionals and friends like Subhash, Bruno, Amin, Tomislav, Alex, Ella and all other experts give their opinions about this subject.
It’s an approach near the real world and my next comments are only to clarify myselft and make this draft even better. Like they say in the draft:
Download the draft in the official website
Some comments/questions
1. The main use of the configuration database is for dealing with bookkeeping information and with data that come from more than one source? An example is to deal with the same customer that comes from two different sources with a different name… Which more funcionalities this configuration database could have? I think this configuration database should be shared with several BI solutions and use SQL schemas to split them. If you have one database configuration for each BI solution you’ll have more cost and more work to maintain it.
2. I’m agree that mirror database, filters only the required data, and the staging area is a relational database that will be a bridge between mirror and datawarehouse denormalized database. If you need to process all the cubes again, datawarehouse database stores all the historical data or you need to go again to mirror database or operational sources?! (sometimes the historic data in operational sources is available only once)
4. Your definition of Metadata OLTP DB makes me a confusion… do you have views to OLTP operational systems to work as a bridge to DW in case of not using a staging area, but normally metadata is contextualizing your data… describing what each field, each table, each database is … I think you could explain this better.
5. Following your draft, in Kimball approach, the Data Warehouse database has a denormalized structure in spite Inmon aproach that has a normalized structure. Am I correct? There is some confusion in this point…
6. I don’t think that in SCD type 2 the status column should be used. I only understand the use of this column in this type of dimensions, only for users that goes inside database and need a fast understand of the current row, because in practice, if you lookup for the status column, you may have a problem if you need to import rows from a previous date (late arriving facts)… and you will map this rows to the status and could not be true
7. My defition of ODS was a little different. Your ODS description (only used by softwares) may cause several views of “true”. What do you think?
8. I’m completed agree with you about Natural and Surrogate Key in date dimension. In my last project I used Surrogate Key as an identity integer, that make me felt lost. I always needed to join the fact table with date dimension to understand the mean of each fact.
9. In page 36 when you wrote “Mirroring is not part of the ETL. “, I think create the mirror is the extract step of an ETL (ExtractTransformLoad)… you don’t do any transformations… only the basics like change column names and extract the data, and if you can apply some filters. Changing column datatype could generate an error… and we must avoid accessing source operational systems more than once
10. In pag 42, in which database you store logs added by SSIS Log provider? (Sysdtslogs). I like the answer you gave to the question “what is the difference between log and audit?” and also the use of audit information directly in the cubes but hidden for business users. Some people are a little confused in this subject.
11. How BI analysts control the data lineage? And how we can make an impact analyst on a change in a database?
Improvements
I think that some comments could be basic for some of you, but because there aren’t any standards and any ISO for this subject, BI Analyst still some has doubts about the best pratiques to follow implementing a BI solution
So, I hope to receive Marco and Albert here to clarify me the points described before. It’s a very important step that these guys are doing, because they are creating rules, standards and make we speak the same language.
Updated 29-09-2008 with Albert Ferrari response. I will reply in the comments section.
1. Yes, the config DB can be shared between different solutions for the same customer even because, in the DWH world, "different solutions for one customer" means "One complex solution for one customer, composed of different subject areas". :)
2. In some situation we store in the OLTP Mirror several snapshots of OLTP tables, in order to be able to rebuild the history. Clearly the DWH contains all the history but, sometimes, you will need to rebuild the DWH too...
3. I would not avoid schemas in order to satisfy third party tools. If some tool need to query the DWH then it will have its views and, if it cannot use schemas, then its views will live in the dbo schema.
4. OLTP Metadata is just a set of views. If you cannot build them in the OLTP DB and you cannot afford an OLTP Mirror, then an OLTP Metadata might help.
5. The Kimball approach is de-normalized but the extent of normalization that is needed in an Inmon one is not very clear, it all depends from the analyst. Inmon leaves all these "details" to the analyst, giving just some advices.
6. I agree with you. What else could I say? :)
7. What is your definition of ODS?
8. Yep. Even if, sometimes, I had to change the key, using some very creative way of expressing a date. If you have several billions rows then the INT representation of YYYYMMDD might be a problem and a smallint solution might be helpful. Clearly 20080101 is not a smallint... so you'll need some differente ways to express the date.
9.I don't understand your note. During mirroring you should not do any kind of transformation and/or column rename. If a column is called C1P8D3B0 ans is of type INT, then you will have that column name and type in the OLTP mirror and a VIEW that will change name and type when you will extract from the mirror.
10. I think the log DB is the perfect place for thos logs.
We are about to publish a second whitepaper that might be interesting to look at, it will provide a demostration of the methodology on the very (ab)used AdventureWorks DB.
Thanks alot for your comments, I will use them and your improvement section for a future review of the whitepaper.
Alberto
Thursday, September 25, 2008
BI Conference - Microsoft Events
And for now… and because I don’t have budget for more events this year, I will be next week here in Lisbon to see Steve Ballmer, Microsoft CEO.
Thursday, September 18, 2008
BI Book - Free SQL 2008 Book
http://csna01.libredigital.com/?urss1q2we6
Wednesday, September 17, 2008
BI Conference - SQL Bits III
The University of Hertfordshire
This is the University where the conference toke place. Is a nice place but too quite for a person that goes alone from another country… Bruno was also there but he went to Irland on Saturday after the conference…


Last SQL Bits, in Birmingham (UK), the location of conference was better because it was near the city centre and because I went with my girlfriend, she could went to shopping while I was attending the event




The Training Day (Friday)
SQL Bits III organization included an extra day for intensive training that cost between 150£ and 200£ depending on the selected subjects. For those persons that never saw MSFT BI Tools and Business Intelligence approachs from Kimball an Inmon, I think that was good, but for others was extremely basic. But was my fault, because Chris Webb advised me… but I’m disappointed because I didn’t see the same organization and the same environment in training day (on Friday) than the free conference day (on Saturday) … and for those (like me and Bruno) who payed for the training day we make a question for ourselves… for which day we payed 150£?! Why we make the question? The reason is simple… the free conference had several partners (Microsoft, Quest, Xlcubed and others), extra events like dinner with beers and pizzas, more than 400 persons, a party and some other interesting stuff. We also thought the training day would be an intensive training using the laptop and making some advanced labs mainly in MDX and Data Mining.
The Conference Day (Saturday)
Some photos of conference environment and a photo of Bruno picking some pizza and beer! Bruno is the guy with black t-shirt! You can also see the taller guy… He’s Alex Whittles that I referenced before and he’s a cool and friendly guy!




The Conference Day Sessions
I’ll describe a a little resume of each session that I attended, maybe I will go into detail in a next post, but for now, I’ll do a short briefing.
Alan Mitchell session
Getting more value from your data through data mining
After seeing the Rafal Webcasts, it’s difficult to find a better session about Data Mining, but Allan was really great and focus on excel 2007 add-In for Data Mining. Allan showed how the business user can apply a Data Mining algorithm without statistical and developing skills. Don’t need to know formulas, don’t need to know programming, it’s all there!!!


The hard task will be the ETL that is the most important step in a Data Mining project. If the dataset is not well created, even the top statistical expert of the world cannot identify real patterns and trends
Gary Crawford session
Data Vizualisation - Make your reports useful to business - No more "Speed Dials"
Maybe some people didn’t saw anything new, but I loved this session. Gary gave some very good tips that must be taken when you create a report or dashboard. He started from default charts generated by excel wizard and gave some design tips to create more friendly and understandble charts for business reports


He also showed some gauges, thermometers and traffic lights usually used in some business reports. He explain some best pratiques that should be followed in order to avoid confused and unperceptable reports




Andrew Sadler session
Microsoft's integrated end-to-end BI offering
I never heard about Andrew Sadler before, and was really a very good surprise for me. He was fast, precisely and efficiency, in the way he explained the integration that could be done with Sharepoint, Performance Point Server and ProClarity


In the demo prepared for the session, he used the Sharepoint portal as the main interface/front office and behind the scenes he created several different types of reports and charts in Performance Point Server (including proClarity) showed inside normal webparts of Sharepoint portal




Andrew Wiles session






In the final, Simon Sabin, invited the attendences to eat some pizza, drink some beers and singing a song in auditorium!
Thanks to the organanizers, that although some little disappointing points, I'm sure that they did everything they could to do a fantastic event as it was in SQLBits III. Thanks to Simon, Chris Webb, Allan Mitchell, Tony, Darren Green and Jamie.
Friday, September 5, 2008
BI Certification – Microsoft MCTS 70-445 Exam Passed
One of the goals that I’ve proposed to achieve this year is almost accomplished! Today I made the Microsoft exam 70-445 (MCTS), and next week I will do the exam 70-446 (MCITP).

The exam 70-445 that I done today includes 3 courses:
Course 2791: Implementing and Maintaining SQL 2005 Analysis Services (3 days)
Course 2792: Implementing and Maintaining SQL 2005 Integration Services (3 days)
Course 2793: Implementing and Maintaining SQL 2005 Reporting Services (3 days)
http://www.microsoft.com/learning/en/us/exams/70-445.mspx
The following list includes the topic areas covered on this exam. The percentage indicates the portion of the exam that addresses a particular skill.
Managing SSAS (12 percent)
Developing SSAS Solutions by Using BIDS (18 percent)
Implementing data mining by Using BIDs (12 percent)
Managing SSRS (13 percent)Developing Reporting Solutions by Using SSRS (19 percent)
Developing Business Intelligence Solutions by Using SSIS (16 percent)
Administering SSIS Packages (11 percent)
The exam wasn’t easy, and I confess that during the exam I was a little afraid, because the questions were very specific and it required a lot of attention! Happily I passed with 842 (84,2%) (minimum is 700)
I have some experience in these subjects for that reason I only studied with MCTS Training Kit. This kit has a book, several labs and about 200 exams to train.

My exam had 44 questions and I felt that almost questions were focused on Reporting Services and Data Mining. I suggest that you get this training book and go inside into the properties of each module. (SSIS, SSAS, SSRS and Data Mining) And some subjects that you must study hard:
Report Model and Report Builder
Data mining algorithms and parameters
Data Mining lift and profit chart (Check the accuracy of your DM Model)
SSIS Transactions
SSAS Security issues
SSAS Performance (Usage based optimization / design aggregations)
SSAS MOLAP (automatic and schedule), ROLAP and HOLAP
DMX Syntax (for data mining models)
SSRS Subreports
If you follow the training kit and focus on the tips above, and are familiar with Microsoft tools I think is sufficient to pass. But some visitors will ask... why SQL 2005 and not SQL 2008? Because SQL Exams is not available yet, and it’s a way to prepare to SQL 2008 and learn more about Microsoft tools. Following this, I already made my registration for the new exams that will be available this month.
Register for the "First to Know" Visual Studio 2008 and SQL Server 2008 Exam Offer
Be Among the First to Certify
http://www.microsoft.com/learning/mcp/sqlvs/offer/default.mspx
Thursday, August 28, 2008
BI Conferences - SQLBits and SSWUG
SQLBits

For this reason I already have my flight and hotel booked for the next SQLBits (www.sqlbits.com) trainingday (150£) and conference (Free) in London next 12/13 September 2008.
Some topics could be basic for me, but it’s important to follow the new BI improvements and mainly the details that in most cases make the difference! And is good to train my English and meet some people to dinner and spend some time sharing experiences. For now I have the confirmation of my friend Bruno Ferreira that is working on Ireland, Alex from UK (http://www.purplefrogsystems.com), some friends from ExpertsExchange, and other people from BI comuninity
SSWUG

If you cannot be on the SQLBits, dont be ungry... you can follow other events like SSWUG conferences from home. There will be four conferences for a simbolic price of $100 (in Euros is less than 100€) and all the conferences are for 3 days and I’m sure it will be fantastic. It’s not difficult to imagine that my preferred conference will be the Business Intelligence conference on September 24-26... but I’m confess that I dont know If I will attend or not... it’s not the same thing a conference from a laptop or a conference on site as I will do in UK on SQL Bits... but I’ will try to attend!!!
Sessions
http://www.vconferenceonline.com/business-intelligence/sessions.asp?offset=-1
Regards!
Thursday, August 14, 2008
Data Mining - Rafal Event
http://fr.youtube.com/watch?v=Vs46QZwnEcc
Introduction to Data Mining
Working with Data Mining
Using Data Mining in Your IT Systems (Part 1)
Using Data Mining in Your IT Systems (Part 2)
http://www.microsoft.com/emea/spotlight/event.aspx?id=99
After the Rafal webcasts, I dont have nothing more to write about it! But I’m currently doing a work for my master BI degree to predict the stock market behaviours... could be a good way to became rich fast!! I’ll tell about it in a next post...