Saturday, May 17, 2008

FIRST BIRTHDAY BLOG


One year ago, I started blogging… I never thought that this blog could have the success that in my opinion it currently has… I started it before my master BI degree, and in spite that today I have lot of subjects to write, in that time it was only my experience in the my company job and the experience from MSDN Forums that I still usually give and receive feedback about operational problems that we face each day.

Today, I really understand the importance of knowledge and the importance of sharing it. In this blog I always try to describe all my evolution/experiences but also try to create a knowledge network through the Internet… its important not only share our knowledge but also stimulate others to do the same. Nothing is more important that do what we really like to do, and If I’m here blogging is because I love business intelligence, and I feel that blogging we can make some difference!

Thanks to my familly and my fantastic girlfriend that always supported me, for all the visitors and for all the persons that directly or indirectly stimulated and keeping the blog alive:

my master BI teacher Mrs. Maria José Trigueiros for all the amazing messages that she gave to me, and teachers as Mrs. Patricia Narciso, Mr. Rui Gonçalves, Mr. Azevedo Rodrigues for the excellent and very good comunication that they had transmitted.
my master BI colleagues
my friend Subhash Subramanyam from India
my friend Amin Jalali from Iran
my friend Bruno Ferreira from Ireland
my friend Ella Maschiach from Israel
my friends from Caixa Geral de Depósitos (Portugal)
my friends from Microsoft Forums
my friends from ExpertExchange website
and thanks to persons like Jamie Thomson, John Hennesey and Chris Webb that had spent some time to visit me here.

Some numbers from google analytics since september/07:
about 7.500 page views
about 2.500 unique visitors (by IP address)
visitors from 89 different countries (mainly from Portugal, USA, UK, India, Iran, Canada, Australia, Germany, France, Israel…(in a next post I will paste here the google analytis reports)

I promise that I’ll try to always improve the quality and diversification of all my posts and in the way they fit in the Business Intelligence concepts.

THANKS!
Pedro

Wednesday, April 30, 2008

SSIS - Updating rows from a FlatFile

Dear Friends,
Today I will describe a subset of an ETL I made and I think that could be useful for you and it will be useful for me if I receive your feedback.

The Problem
I have customer’s data (sent by an application by text files to a specific file system share), that must be imported to a SQL Server 2005 database, more precisely, to the customers database table.

All Fridays, all the data in the text file, must be extracted, transformed and loaded into the SQL Server 2005 database. After the extraction the file must be moved to another directory in the file system. (…\ProcessedFiles)

The text file is a fixed column type, and there is a column that describes the event to follow when ETL update the Customers table.
N (means new customer to insert)
A (means update existent customer)
D (means update/inactivate existent customer)

The Solution
The data will be extracted from the text file into a preTable (“preCustomer”) without any transformation/rules (to avoid risks). For each row imported to this preTable, I will have an extra column that is a foreign key to my audit table information. Next, I read all the records from the last extraction, transform and load into Customer table.

The end user, from an asp application, will be able to correct or ignore the redirected rows, and next time the ssis package executes, these errors will disappear. (Note: the warning and error rows will be stored in redirectCustomer table, until some action (correct or ignore) from the end user)

Logical Data Flow
I have several steps that could be useful to explain, but today I will focus only in the subset that I think is more interesting. Below you can see the logical data flow for the ssis package number 3, that transform and load the customer data from the preCustomer table into customer table.



SSIS Package
And finally the ssis package...
(Note that the schema of ETL_Customer is identically to the ETL_redirectCustomer, for this reason you can union all the two sources in the pipeline)





Operational Risk
Remember that in a business intelligence project 70% of the risks occurred in the ETL module. So pay attention to the ETL and always try to identify the existence of an operational risk. I use the ssis error logs only for the real unexpected errors, if you think that there is a probability of some task fail, don’t wait to see the error log, try to mitigate the error before it!!!

“Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events. This defi nition includes legal risk, but excludes strategic and reputational risk.”
The European Commission’s Directive 2006/48/EC9


Some tips for using the best practices
I selected some important tips that could be followed in order to implement a good ETL project, using the best pratices.
- Always comment you project
- Always create the control flow and dataflow logical model and always before implementation of ssis package.
- For all the documents created, insert each document inside a template with all the important information to share with all the teams involved in the project. It’s very important that when you look for a document, you could identify the subject and scope of the document fast. For example, for the template I created for Logical Data Flow, I added important information as project, package, dataflow, developer/team, date, version, purpose, reference…
- Use naming conventions for the tasks and components of the SSIS Packages
- Use SSIS log provider to catch error and warning events.
- Could be useful implement not only the SSIS log provider, but also some custom logs, using event handlers.
- Test you package! How?! Use SQL Data Generator to generate thousands of rows to your source data. This software is very good, for each column you can select all the possible values that you want to random. (These values could come from a database, text file…)

I hope this post could be useful for you, and I also hope that you can give some feedbak about this.
Next post I will try to write something about data mining!!!
Cheers!!

Thursday, April 10, 2008

BI Education - ExpertExchange Master Certification

Dear Friends,
After 2 months answering questions in www.experts-exchange.com I finally obtained the master certification in SQL Server 2005. This is one of the reasons that I didn’t update the blog so frequently! This is the certification that I won today:

There are 9 different ranks that you could achieve, I’m only in the first rank, but I will try to achieve the higher. Why you don’t join the game? It’s cool, you feel that understand something for some subject and you can help other people. lolol

It’s fantastic receive feedback from the asker as you can see for the image below

I will be waiting for you on ExpertsExchange!
Cheers!

Friday, March 21, 2008

BI Tools - SQL2008 CTPFebruary Bug (SSIS)

Dear friends,
I found the gold… no...no… I found the bug! J This mistake is because I never thought that a simple person like me, here in my small country with the name of Portugal, that sometimes people think the country is a state of Spain would found a Microsoft bug in SQL 2008 (SSIS)... ok… but let’s talk about the bug. Sorry for this introduction…

As I told you in a previous post, I’m developing a SSIS project for my master BI degree, using the new SQL 2008 CTPFebruary version. I found a bug, that I think is critical and very important.
If you already worked with SSIS 2005, you are probably familiarly with the warning messages that advise you for unused columns in your pipeline:






For those people who never saw the messages, I made a very simple example. (Note: One of the best practices to develop an ETL package is to try to minimize the size of each row in your pipeline. If you forget to delete these unused columns you will be decreasing ETL performance. If you already used SSIS 2005 and never saw the messages, take care)
This simple example in previous version of SSIS (2005) only executes a query in the table “Person.Contact” on AdventureWorks and inserts the returned rows into a temporary table.

Steps:
1. Add a dataflow “DFT TestBug” to the ControlFlow
2. Goto the dataflow created in the previous step
3. Add an OLEDB Source to get some data from AdventureWorks database:
SELECT Title, FirstName, MiddleName FROM Person.Contact
4. Add an OLEDB destination to insert the data into a #tempTable. Map the source and destination columns title and firstname and intentionally leave the column MiddleName without map.


As you can see for the image below, the SSIS 2005 advice you for the unused columns with warning messages. In the new version, these messages doesn't appear!!!

I posted this bug in MSDN Forums
http://forums.microsoft.com/Forums/ShowPost.aspx?PostID=3044877&SiteID=1&mode=1
and James Thomson and Bob Bojanic kindly suggested me to submit this bug to Microsoft. I submit the bug as you can see for the link below:

Sorry MSFT team for this post, but for me this messages are very important and it will be a problem if the final version doesn’t correct this!
Cheers!!!
Pedro

Thursday, March 20, 2008

BI Education - The Data Warehouse Lifecycle Toolkit

Dear Friends,
I ‘m very busy with my job and my master BI degree, but I will try to come here with more frequently. I have too many things to blog and I don’t know where I will start from.

For now, I’ll write few words about the recent book of Ralph Kimball. I confess that I always was a fan of Kimball books, but normally those books have a lot of years. Although the subjects of almost the books are in use, I think that they was a little old and could be better adapted and integrated.

Ralph Kimball Group launched a new and fantastic book, “The Data Warehouse Lifecycle Toolkit” (second edition) on last January 2008. I read only four chapters yet, but I really feel that this book is better than others from Ralph Kimball Group. Check the contents of the book in the amazon or wiley website.
Obviously, I bought this book, and it has a fantastic price on Amazon (UK) at
http://www.amazon.co.uk/exec/obidos/ASIN/0470149779/ref=nosim/schildnet0a I paid for the book about 18€ (Euro) with shipment included and received in 2 days!!! (I’m in Portugal… don’t forget it) Normally, I buy the books at Amazon directly to the Amazon partners because is more cheaply, but in this case there is no difference between Amazon official price and the partners price.
I’m preparing a post about Reporting Services 2005 and 2008 querying an OLTP database.
Regards!!

Tuesday, March 4, 2008

BI Conference - SQLBits - Birmingham (UK) - After

Dear Friends,
I’m just arriving from Birmingham (UK) and I only have one Word to describe the experience: FANTASTIC! I have several things to write but if you weren’t there you missed a fantastic day!
As you already know, I’m currently living in Portugal but last Friday I pick my girlfriend and went to Birmingham (UK) to attend the SQLBits conference and back to the city where I’ve finished my degree some years ago. Microsoft Portugal is not investing too much in SQL BI tools, and in the last summer it had only one person specialized in these tools… To prove it, the next TechDays 2008 in Portugal there will be just a few sessions in SQL BI tools and probably too basic… The possibility to hear some “gurus” and the authors of my preferred books from Wrox and Wiley, it was an opportunity that I couldn’t missed!
I will describe in general how it was this journey to Birmingham and I will detail my preferred sessions on detail in a next post.

The Conference
There was several sessions in this conference, and I attended the sessions related with Business Intelligence MS SQL tools. Darren Green, Allan Mitchell and Chris Webb didn’t disappointed me in their sessions, and the money expended in this journey was a very good investment on my career and a good opportunity to “open my eyes” more and more to the world.

Allan Mitchell Session
Allan spoke about the new features of SSIS 2008 and compares it with the SSIS 2005. He shows some details that usually some developers don’t pay much attention, the performance… he show some details behind the scene, and in my opinion he done it good!
MultiCast
Pipeline Limiter
LookUp Component (Cache Options)
Data Profiling Task
Change Data Capture

Chris Webb Session
This session was very interesting… I expecting to see something about new features and good advantages of SSRS, and what I saw was the opposite and I’m totally in agree with him and this session prove that this type of events are not to sell products but is a place where we can receive very good and sincerely tips and tricks from developers to developers. And Microsoft was one of the sponsors…

Darren Green Session
Darren session was very cool and he spoke about a subject that I normally in SSIS avoid but after the session I will take more attention… and the subject is custom data components. I bought the WROX book that Darren wrote and refer this subject but I didn’t read those chapters… But is really interesting , not only to understand better the pipeline engine but also to develop more specific and complex transformations.





The City and BullRing
As I told you before, I finished my degree in Birmingham some years ago. There are several improvements in the city but it still beauty! A very good improvement was the BullRing Center that is a big shopping place with an area equivalent to 26 football stadiums!!!





The Hotel
Normally, hotels in England are not comfortable or modern, but this Hotel is very modern and I knew a Portuguese man that works there!



The NightLife
The night bars and discos are incredible on pressure of safety guards…



So... in resume, was a fantastic journey!! I will try to detail more about the conference in a next post!
Cheers!!

Monday, February 18, 2008

BI Conference - SQLBits - Birmingham (UK)

Dear Friends,
Today I'm very happy!! I received the confirmation for the next SQLBits conference in Birmingham(UK). And I'm happy for several reasons...

  • I will return to Birmingham where I finished my IT Degree some years ago (NewMan College 2001)
  • I will attend to the sessions of Chris Webb and Darren Green that are one of my main sources of inspiration and Allan Mitchell that is one of the authors of my favorite SSIS book at WROX.
  • Change knowledge with the BI comunity and see old and new friends.
  • Learn more about BI and SQL 2008 tools
  • Enjoy Birmingham lifestyle again!
I'm sure you have a lot of reasons like me to the conference. Dont waste more time, there are more than 400 people confirmed!!

For registations follow this link: http://www
.sqlbits.com
There will be several sessions, I selected these ones:

Slot 1:
Master Data Management (MDM)
Taken by Sutha Thiru
Slot 2:
Extending SSIS with custom Data Flow components
Taken by Darren Green
Slot 3:
Understanding LINQ in .NET Framework V3.5 and Beyond
Taken by Mike Taulty
Slot 4:

Making more of SSIS in 2008
Taken by Allan Mitchell
Slot 5:
Using Analysis Services as a data source for Reporting Services reports
Taken by Chris Webb

The agenda: www.sqlbits.com/information/MainAgenda.aspx
I will book my flight and hotel tonight!!

And for some reason you cannot go to Birmingham, prepare to go to Glasgow on 10-May-2008... I will try to be there... so, vote in the sessions you want to see!! Follow the link image!

See you around!
Regards!

Wednesday, February 13, 2008

SSIS - Execute a package from an ASP.NET page

Dear Friends,
Today I'm describing the way I executed a package from an ASP.NET page and passing a parameter.

The Problem
I need to enable the user of an ASP application to execute a package and passing a parameter to it when the user press a button.

Solution
The first thing you must know, is where you ssis packge will be stored. You or the DBA Team could stor the SSIS package inside database, in the filesystem or in DTS. Each one has advantages and disadvantages, and in my opinion the package inside database could be better when you must unsure more security.

So, in my SSIS package I have created a global variable "strCFG_TextPath" and I will update this value from ASP page. I have created a function fExecSSISPackage in a clean webservice to make the call.
(You can create the function in the ASP page, but in my case I made like this because I insert all the connections to the database inside a WebService)

WebService

In this example the SSIS package is stored in the filesystem, but the function is prepared to accept each of the 3 types that I described before.

You only need to write few lines: (SourceLocation is the string value I want to pass to the SSIS package)

Dim myPackage as Package
Dim integrationServices As New Application
myPackage = integrationServices.LoadPackage(packagePath, Nothing)
myPackage.Variables("strCFG_TextPath").Value = sourceLocation
myPackage.Execute()

ASP.NET Page
To call the function in the WebService is simple... see the image:

I hope this post could be helpfull for you.
Kind Regards,
Pedro

Tuesday, February 5, 2008

BI News

Dear Friends,
Sorry for this delay to write a post... you dont imagine the lot of things I have to share and change with you. This delay is because in the last week I was reading a new book... a book that tells me how I can earn money in the financial markets!! eheheh... the book is very good, and was written by a person that worked in the same bank that I'm currently working as outsource BI developer... I'm working for the big IT company in Portugal... It's Novabase (http://www.novabase.pt/)

The book is really good and explain how the markets work and explain some tips that we must care when we invest some money in equities or other financial instruments. I bought this book because I'm currently working as BI developer in a project in this business. This book is writetten in portuguese, but I will try to know if there is some version with english translation.(http://www.keditora.com/nov.html)
But the main purpose of this post is to tell tou that I will start to write more technical posts related to ETL using Microsoft tools. Not only because I have a ETL class in my master degree but because I have some tips that I think could be helpfull to you, like SSIS package configurations, SSIS audit packages, SSIS named conventions, and more and more... so, keep in touch with my blog!!!
Cheers!
Pedro

Friday, January 25, 2008

Microsoft TechDays 2008 - Portugal (Lisbon)

Dear Friends,
This post is only to inform you that next March 2008, it will take place the bigger Microsoft Conference in Portugal. Last year I was there, and it was amazing!!! If you can go there, don't hesitate, it will be very good for sure.

In the conference will be announced the new SQL Server 2008, Visual Studio 2008 and Windows Server 2008. Download the agenda http://www.techdays.pt/Download/AgendaDraft.pdf
Official Site: http://www.techdays.pt/default.aspx
Regards!!

LinkWithin

Related Posts Plugin for WordPress, Blogger...