I’ve been a reasonably frequent publisher of blog entries at Mark Polino’s site related to Analysis Cubes, but will also start posting here as well.
We just finished another one of our Fixed Fee Packaged Offerings for Analysis Cubes for Excel. In these very competitively priced (and quite valuable) offerings, we do a complete deployment of Analysis Cubes including installation, training, custom modifications to the cubes and custom report writing. In this last one, we uncovered (and then fixed) a slew of issues. I’ll list some of the bigger ones here:
The Data Source View is Improperly Built – Part 1
We noticed the processing time on the cubes was WAAAAY too long. In fact, it almost crashed the server. We drilled down on the problem by looking at the query that was being built to populate the Total Expenses dimension and noticed it was returning about 88 million rows. Further examination showed that the offending item was the join between the Total Expense table and the Fiscal Time table. The join was based solely on the company ID field in both tables. The problem with this is that the Total Expense table contains one record per AP voucher per company. The Fiscal Time table contains one record per company per day since the cut off date selected in the install. So, if you’ve been operating for 4,380 days and you have 12,000 AP vouchers in that company, you get 144,000,000 records returned (4,380 * 12,000) just for that one company. Multiply by the seven companies with which we were working and you can see where the problem really comes in. The correct join should be included the Document Date from the Total Expenses table and the Full Date from the Fiscal Time table.
The Data Source View is Improperly Built – Part 2
In resolving the above, we noticed that many of the tables had similar, incomplete joins. As we started looking further, we then found then often the correct join (like Company ID and Document Date to Company Id and Full Date) was spread across two different connections. The problem with this is Analysis Services only uses on (the first) connection in the DSV to build its query against the warehouse. So, defining compound keys in separate connections is about the same as defining them incorrectly in the first place.
Fixing the above two issues by fixing the DSV reduced the processing time from 3+hours and crashing the server to 39 minutes.
Packages and Encryption
We installed ACE under a domain account with Windows Admin and a bunch of other privileges (per the manual). We could run the SSIS packages. We would create and run the job. We could schedule the job and have it run. However, the second we logged out of the server console, the job would fail.
That’s right, the job ran only if the install user was logged in to the SQL Server console. So, I could login to the console and have nothing open. Then, I could start SQLMgtStudio from another workstation logged in as another user and start the job. Everything went fine. But if I tried to run the job WITHOUT the install user being logged into the console, it failed. Turns out the encryption level (Encrypt Sensitive Data with User Key) for the SSIS packages is seriously buggy and pretty much prevents you from doing anything you really need to do. Even setting up proxy credentials didn’t work, so we had to have Tech Support decrypt the packages for us.
Bad Data
Other blog entries I’ve done addressed the Retained Earnings issues with ACE, but we found a new one. The Customers Over Credit Limit Dimension was not returning the right data. When we looked at the decrypted package, we found out the the sign was wrong: Customer Balance > Credit Limit was being set to No rather than Yes. That was a pretty quick change to the package (although you would easily do the same by using an update script on the CustomerMaster table in the warehouse).
The above applies to GP 9.0, but we found many of these same issues in GP 10.
We now do all these fixes as part of our standard deployment, but though the above might save you some time.
Dwight