Sunday, November 6, 2011
Tuesday, October 4, 2011
* Have "something" in place that they are not happy with or is costing them too much money.
* Have data in multiple silos that they need to access, consolidate and optimize.
- Is the data you need to access all in one location? - No
- Does the data you have support a majority of questions that will be asked of it? - Don't know
- Would you like answers to questions that occur on a regular basis? - Yes
- Would you like your users to answer their own questions on a random basis? - Yes
- Would you like your users to explore and discover answers to questions they did not think to ask? - Yes
- Do you have a predefined set of KPIs to manage and track business performance? - Yes
- Would you like your executives to see an at a glance view of those KPIs? - Yes
- Would you like to be aware of "something" when a defined threshold is met? - Yes
Director of Enterprise Solutions
Sunday, September 25, 2011
Denial of Service (DoS) attacks, IP Spoofing, Comment Spamming and Malware programming... are malicious activities designed to disrupt services used by many people and organizations. If you are taking advantage of the internet to run your business, create an awareness of a product or service or simply keep in touch with friends and family, your systems are at risk of becoming a target.
Successful internet "intrusions" can cost you money and even steal your identity. DoS attacks can prevent internet sites from running efficiently and in most cases can take them down. IP Spoofing, frequently used in DoS attacks, is a means to "forge" the IP address and make it appear that the internet request or "attack" is coming from some other machine or location. And Comment Spamming, oh brother...where programs or people flood your site with random nonsense comments and links in an attempt to raise their site's search engine ranking or increase internet traffic to their sites:
"Nice informations for me. Your posts is been helpful. I wish to has valuable posts like yours in my blog. How do you find these posts? Check mind out [link here]"
Huh? - LOL
You may already have defensive measures in place to address some if not all of these things. There are programs, filters and services that you can use to look up, track and prevent this sort of activity. However, with the continuous stream of unique and newly produced malware, those programs and services are only as good as the latest "malicious" activity that is captured. No matter what, it will eventually cause headaches for many people and organizations around the globe. Being able to monitor when something is "just not right" is a great step in the right direction.
In September of 2010, I introduced the Pentaho Evaluation Sandbox. It was designed as a tool to assist with Pentaho evaluations as well as showcase many examples of what Pentaho can do. There have been numerous unique visitors to this site, both legitimate and some as I soon discovered...not. Prior to the site's launch, using Pentaho's Reporting, Dashboard and Analysis capabilities, I created a simplistic Web Analytic Dashboard that would highlight metrics and dimensions of the Sandbox's internet traffic. It was a great example to demonstrate Pentaho Web Analytics embedded in a hosted application. Upon my daily review of the Site Activity dashboard which includes a real-time visit strip chart monitor, I noticed an unusually large spike in page views that occurred within a 1 minute time-frame.
Now that spike can be normal, provided a number of different people are surfing the site at the same time. However, it caught my attention as "unusual" because I knew what was normal. The dashboard quickly alerted me to something I should possibly take action on. So I clicked on the point at the peak to drill down into the page visit detail at that time. The detail report revealed that who or whatever was accessing the Sandbox was rapidly traversing the site's page map and directories looking for holes in the system. I also noticed that all the page views came from the same IP address in under 1 minute. Hmmm, I thought. "That could be a shared IP, a person or even a bot ignoring my robots.txt rules." But...as I scrolled down I further discovered there were attempts to access the .htaccess and passwd files that protect the site. I immediately clicked on the IP address data value in the detail report (in my admin version of the report) which linked me to an IP Address Blacklist look-up service. The Blacklist Look-up program informed me that the IP address had been previously reported and was listed as suspicious for malicious activity. BINGO! Goodbye whoever you are!
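The check the dashboard made visible, many page views from one IP address inside a single minute, can be sketched in a few lines of Python. This is only an illustration, not how the Sandbox dashboard is built: the log records, the IP addresses (taken from the documentation ranges) and the threshold of 5 are all assumptions, and it buckets hits by calendar minute rather than using a true sliding window.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access-log records: (timestamp, client IP, requested path).
SAMPLE_HITS = [
    ("2011-09-20 10:15:02", "203.0.113.7", "/index.php"),
    ("2011-09-20 10:15:09", "203.0.113.7", "/admin/"),
    ("2011-09-20 10:15:17", "203.0.113.7", "/.htaccess"),
    ("2011-09-20 10:15:31", "203.0.113.7", "/passwd"),
    ("2011-09-20 10:15:48", "203.0.113.7", "/backup/"),
    ("2011-09-20 10:15:50", "198.51.100.4", "/index.php"),
]


def flag_bursts(hits, threshold):
    """Return {(ip, minute): count} for each IP with at least
    `threshold` page views inside one calendar minute."""
    counts = Counter()
    for stamp, ip, _path in hits:
        # Truncate the timestamp to the minute so hits share a bucket.
        minute = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(second=0)
        counts[(ip, minute)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


suspicious = flag_bursts(SAMPLE_HITS, threshold=5)
```

Anything that lands in `suspicious` is a candidate for the same drill-down described above: look at the paths requested and check the address against a blacklist look-up service.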
Wow, talk about taking action on your data huh?
It is not a question of if, but when an unwarranted attempt will occur on your systems. Make sure you take the appropriate steps to protect them by using the appropriate software and services that will make you aware of problems. My experience may be an oversimplification but it is a great example of how I used Pentaho to make me aware of a problem and take that raw data and turn it into actionable information.
Special thanks to Marc Batchelor, Chief Engineer and Co-Founder of Pentaho for helping me explore the corrective actions to take to protect the Pentaho Evaluation Sandbox.
Director of Enterprise Solutions
Monday, September 19, 2011
You have questions. How do you get your answers? The methods and the tools used to help get those answers to business questions will vary per organization. For those without established BI solutions, desktop database query and spreadsheet tools are...all too common. And...if there is a BI tool in place, its usage and longevity depend on its capabilities, the cost to maintain it and its ease of use for both development staff and business users. Decreased BI tool adoption, due to rising costs, lack of functionality and complexity, may increase dependencies on technical resources and other home grown solutions to get answers. IT departments have numerous responsibilities. Running queries and creating reports may be ancillary, which can result in information not getting out in a timely manner, questions going unanswered and decisions being delayed. Therefore, the organization may not be leveraging its BI investment for what it was originally designed to do...empower business users to create actionable information.
(Read the similar experiences of Pentaho customer Kiva.org here at Marketwire: http://www.sys-con.com/node/1971384)
Six of One, Half a Dozen of the Other
The BI market is saturated with BI tools, from the well known proprietary vendors to the established commercial open source leaders and niche players. There are choices that include the "Cloud", on premise, hosted (SaaS) and even embedded. Let's face it and not complicate things...most, if not all, of the BI tools out there can do the same thing in some form or fashion. They are designed to access, optimize and visualize data that will aid in the answering of questions and tracking of business performance. Dashboards, Reporting and Analysis fall under a category I refer to as "Content Delivery". These methods of delivering information are the foundation of a typical BI solution. They provide the most common means for tracking performance and identifying problems that need attention. But...did you know that there is usually some sort of prep work to be done before that chart or traffic light is displayed on your screen or printed in that report? That prep work can range from simple ETL scripting to provisioning more robust Data Warehouse and Metadata Repositories.
Content Delivery should begin first with some sort of Data Integration. In my 15 years in the BI space I have not seen one customer or prospect challenge me on this. They all have "data" in multiple silos. They all have a "need" to access it, consolidate it, extrapolate it and make it available for analysis and reporting applications. Whether they use it already as second-hand data, loaded into an Enterprise Data Warehouse for historical purposes, or produce Operational Data Stores, they are using Data Integration. Whether they are writing code to access and move the data, using a proprietary utility or even some ETL tool, they are using Data Integration. It is important to realize that not all data needs to be "optimized" out of the gate, as it is not only the data that is important. It is how it will be used in the day to day activities supporting the questions that will be asked. This requires careful planning and consideration of the overall objectives that the BI tools will be supporting.
Well, How do I know what tools to use? - Stay Tuned
With so many tools available, how will you know what is right for the organization? Thorough investigation of the tools through RFIs, RFPs, self evaluation and POCs is a good start. However, make sure you are selecting tools based on their ability to solve your specific current AND future needs and not solely because they look cool and provide only the "sex and sizzle" the executives are after. The typical need is always Reporting, Analysis, Dashboards. Few realize that there is a lot more to it than those three little words. In the next part of this article I will cover a few of the most common "BI Profiles" that are in almost every organization. In each profile I will cover the Pains, Symptoms and Impacts that plague organizations today as well as the solution strategies and limitations you should be aware of when looking at Pentaho.
Director of Enterprise Solutions
Tuesday, September 13, 2011
- Gather needs and requirements
- Take 1 Pentaho Installation
- Add your data
- Add Training
- Can substitute: Pentaho Sales Engineering, Consulting or a Pentaho Certified Network Partner
- Prepare a Scope of Work
- Communicate Effectively
- Execute Accordingly
- Sit back and enjoy your lower TCO
Thursday, July 7, 2011
Recently, I have been asked about Pentaho's product interaction with social network providers such as Twitter and Facebook. The data stored deep within these "social graphs" can provide its owners with critical metrics around their content. By analyzing trends within user growth and demographics, and consumption and creation of content, owners and developers are better equipped to improve their business with Facebook and Twitter. Social networking data can be viewed and analyzed utilizing existing tools such as FB Insights or even purchasable 3rd party software packages created for this specific purpose. Now...Pentaho Data Integration in its traditional sense is an ETL (Extract Transform Load) tool. It can be used to extract and extrapolate data from these services and merge or consolidate it with other relevant company data. However, it can also be used to automatically push information about a company's product or service to the social network platforms. You see this in action if you have ever used Facebook and "Liked" something. At regular intervals, you will note unsolicited product offers and advertisements posted to your wall or news feed from those companies. A great way to get the word out.
The Facebook Graph API
Both Facebook and Twitter provide a number of APIs; one worth mentioning is the Facebook Graph API (don't worry Twitter, I'll get back to you in my next blog entry).
The Graph API is a RESTful service that returns a JSON response. Simply stated, an HTTP request can initiate a connection with the FB systems and publish / return data that can then be parsed with a programming language or, even better, without programming using Pentaho Data Integration and its JSON input step.
Since the FB Graph API provides both data access and publish capabilities across a number of objects (photos, events, statuses, people, pages) supported in the FB Social graph, one can leverage both automated push and pull capabilities.
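For readers who do want to see the "parsed with a programming language" route, here is a minimal Python sketch of turning a JSON response into flat rows, roughly what a JSON input step would hand to the rest of a transformation. The sample payload, its ids and its field names are made up for illustration; a real response will carry whatever fields the requested object exposes.

```python
import json

# Hypothetical response body; ids, messages and timestamps are invented.
SAMPLE_RESPONSE = """
{"data": [
  {"id": "1_100", "message": "New report bursting tutorial",
   "created_time": "2011-07-01T12:00:00+0000"},
  {"id": "1_101", "created_time": "2011-07-02T09:30:00+0000"}
]}
"""


def extract_rows(raw_json, fields):
    """Flatten each item under "data" into a dict holding only the
    requested fields; fields missing from an item come back as None."""
    payload = json.loads(raw_json)
    return [
        {name: item.get(name) for name in fields}
        for item in payload.get("data", [])
    ]


rows = extract_rows(SAMPLE_RESPONSE, ["id", "message"])
```

Using `item.get` rather than indexing keeps the sketch tolerant of items that lack a field, which is common when different kinds of posts come back in the same response.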
Tutorial: Publishing content to a Facebook Wall Using Pentaho Data Integration
The following is an example of a reference implementation to walk you through the steps to be able to have Pentaho Data Integration automatically post content to a FB Wall.
It is broken down into the following steps:
- Create a new FB Account
- Create a new unique FB user name
- Create a new FB application
- Obtain permanent OAUTH access token
- Create PDI transformation
Step 1: Create a new FB account
Step 2: Follow Instructions to set up your unique username
Add your own - or accept the defaults.
Step 3: Create a FB Application
Allow "Developer" access to your basic information.
After you allow access to the Developer App - go back here: https://www.facebook.com/developers/createapp.php if it does not redirect you.
Click Web Site
Note your application ID and Application Secret
Application ID: xxxxxxxxxxxxxxx
Application Secret: yyyyyyyyyyyyyyyyyyyyy
Enter your Site URL and Site Domain, this can be pretty much anything, but attempt to use your real information if available.
Note Settings, App ID, API Key and App Secret
Note: From here you can follow the link below for a detailed tutorial on setting up permanent OAUTH access:
Those steps are summarized below:
Step 4: Obtain Permanent OAUTH Access Token:
Create the URL below and open it in your browser, modifying it to use your client_id and redirect_uri. See the notes in the blog post linked above and set the permission values accordingly (http://developers.facebook.com/docs/authentication/permissions/).
Your client_id is your App ID and the redirect_uri can be anything.
You will get the following screen - yours might be different depending on what permissions you selected - make sure at least that "Post to my Wall" is there.
If not, verify your permissions against the permissions link in the blog post.
Now note the URL in the browser address bar: you were redirected to the page that you specified in redirect_uri.
You need the code value.
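If you would rather not copy the value by hand, pulling it out of the redirected address can be sketched in Python as follows. The example.com address and the code value are placeholders, not real output, and `extract_code` is just a helper name chosen for this illustration.

```python
from urllib.parse import parse_qs, urlparse


def extract_code(redirected_url):
    """Return the value of the `code` query parameter from the URL the
    browser was redirected to, or raise if it is not present."""
    query = parse_qs(urlparse(redirected_url).query)
    values = query.get("code")
    if not values:
        raise ValueError("no code parameter found in the redirect URL")
    return values[0]


# Placeholder address standing in for whatever you used as redirect_uri.
code = extract_code("http://www.example.com/callback?code=AbC123xyz")
```

Because the URL is parsed rather than split on "=", anything that follows the query string (such as a trailing fragment) is not mistaken for part of the code.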
The code parameter will be a very lengthy string of random characters. Copy this value and hang on to it for the construction of a new URL.
This URL will turn the generated code into a valid access token for your application.
Sample of what is returned:
Now Create the Following:
Fill in your application ID, application secret, redirect uri, and the code we just copied. Again, ours looks like this:
You will get back an access token:
Now you should be able to use PDI and its HTTP Post step with the various FB Graph APIs (http://developers.facebook.com/docs/reference/api/) to do things such as posting content to the FB wall / news feed.
Step 5: Create a PDI Transformation using the HTTP POST step and the FB Graph API with /PROFILE_ID/feed
- Create a new Transformation
- Use a Generate Rows Step (found under Input) to set the various Facebook parameter names that can be found here
- Make sure to use the access_token parameter and value you got from the steps above
- Add HTTP Post step (found under Lookup) and connect hop from Generate Rows
- Configure the HTTP Post step to use the feed RESTful service https://graph.facebook.com/mpentaho/feed
Refer to http://developers.facebook.com/docs/reference/api/ Publishing section for list of methods
Replace mpentaho with your unique user name you set up earlier
- Jump to the Fields tab and click "Get Fields" under the "Query parameter" panel
- Click OK, Save and right click on the HTTP Post Step and select Preview, then Quick Launch
- In a few seconds a panel should come up displaying your data
- Check the result column (at the end) and look for a return code such as:
- Check your newly created Facebook account wall and you should see
- If not check your FB account security and application privacy settings to ensure the application has access.
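For comparison, the request that the HTTP Post step sends in Step 5 can be sketched in plain Python. This only builds the request and does not send it. The `message` parameter name, the token value and the helper name `build_feed_post` are assumptions made for illustration; use the parameter names from the Publishing section of the API reference and the access token you obtained in Step 4.

```python
from urllib.parse import urlencode
from urllib.request import Request

GRAPH_BASE = "https://graph.facebook.com"


def build_feed_post(profile_id, access_token, message):
    """Build (but do not send) a form-encoded POST to /PROFILE_ID/feed."""
    body = urlencode(
        {"access_token": access_token, "message": message}
    ).encode("utf-8")
    return Request(f"{GRAPH_BASE}/{profile_id}/feed", data=body, method="POST")


# "mpentaho" is the user name from the tutorial; the token is a placeholder.
request = build_feed_post("mpentaho", "YOUR_ACCESS_TOKEN", "Hello from PDI")
# To actually publish, pass `request` to urllib.request.urlopen(...).
```

Keeping the parameters in one dictionary mirrors the Generate Rows step, where each field becomes one query parameter of the POST.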
Director of Enterprise Solutions
Monday, June 27, 2011
Report bursting is the process of sending personalized formatted results derived from one or more queries to multiple destinations. Destinations can be file systems, email distribution lists, network printers or even FTP hosts, allowing a wider range of distribution methods. Usually, the end result will display information pertinent to the recipient or location; therefore each recipient only sees their own data. Below is a brief example of how Pentaho Report Bursting can be achieved with Pentaho Data Integration 4.2. By leveraging Pentaho Data Integration's new Pentaho Reporting Output step, one can create a simple task that executes and renders multiple reports from a single Pentaho Report template. This is a truly powerful example of how Pentaho Data Integration can be used for more than just ETL.
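The idea behind bursting (one template, one rendered result per recipient, each containing only that recipient's rows) can be illustrated with a small Python sketch. This is not the Pentaho Reporting Output step, just the underlying pattern; the sales rows, region names and plain-text "template" are invented for the example.

```python
from collections import defaultdict

# Hypothetical query result: one row per sales rep.
ROWS = [
    {"region": "East", "rep": "Ann", "sales": 1200},
    {"region": "East", "rep": "Bob", "sales": 800},
    {"region": "West", "rep": "Cy", "sales": 500},
]


def burst(rows, key):
    """Split the rows into one group per distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def render(recipient, rows):
    """Render one report from the shared template for one recipient."""
    lines = [f"Sales report for {recipient}"]
    lines += [f"{row['rep']}: {row['sales']}" for row in rows]
    lines.append(f"Total: {sum(row['sales'] for row in rows)}")
    return "\n".join(lines)


# One rendered report per region, each limited to that region's data.
reports = {name: render(name, rows) for name, rows in burst(ROWS, "region").items()}
```

In a real bursting job the final step would write each entry of `reports` to its destination (a file, an email, a printer) instead of keeping it in memory.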
Special thanks to Wayne Johnson, Senior Sales Engineer for providing the sample and setup document.
How To document and sample here
Tuesday, May 31, 2011
Before you Begin
The following tutorial should be used to set up a simple reference implementation of the Pentaho BI Server configured with LDAP authentication. The prerequisites needed in order to be successful with this tutorial include an existing installation and usage of the Pentaho BI Server and Enterprise Console, a simple understanding of LDAP and the ability to follow standard installation procedures using install wizards. The tutorial is presented from a Windows operating system perspective, but is applicable across multiple platforms. It is recommended that you get the reference implementation working successfully before configuring your Pentaho BI Server to use your own LDAP configuration. The rest can be found here at the Pentaho Evaluation Sandbox.
Saturday, May 21, 2011
Pentaho Experience Level: Medium to Advanced
Spatial reporting, also known as geographical reporting, is a great way to answer the question: "Where are my....(fill in the blank here)?" It is a great way to visualize the spatial or location component of your data (Latitude, Longitude, Country, County, Region, City, State, Zip Code etc). It can also tell you where the lowest or highest concentration of a desired metric may lie with the use of color gradients or conditionally styled points. The ability to drill in even deeper allows you to eliminate the surrounding areas and focus your attention on the areas that may need it most. The Pentaho BI Platform can take advantage of 3rd party visualization solutions such as the Google Maps API and integrate it as a component that can be used with the Pentaho User Console.
Read more and come see an example in action here: http://sandbox.pentaho.com/samples-and-examples/samples-and-examples/dynamic-google-maps-widget/
Director of Sales Engineering
Monday, May 9, 2011
View the Techcast here.
Download document AND sample .ktr and .prpt files here.
To use the sample:
- Unzip *.zip file
- Copy Files to temporary folder
- Use Report Designer and Open the PRPT file (the .ktr is already embedded in it)
- Publish to Pentaho User Console as you would with any other Pentaho Report
- Optional: Import .ktr file into PDI to see the simple transformation
Senior Sales Engineer
Originally posted on the Pentaho Evaluation Sandbox:
Tuesday, March 29, 2011
“Experts often possess more data than judgment.” - Colin Powell. Perhaps because they did not have a highly scalable Business Intelligence solution in place to assist them with their judgment. :-)
Data is everywhere! The amount of data being collected by organizations today is experiencing explosive growth. In general, ETL (Extract Transform Load) tools have been designed to move, cleanse, integrate, normalize and enrich raw data to make it meaningful and available for potential decision makers. Once data has been "optimized", it can then be turned into "actionable" information using the appropriate business applications or Business Intelligence software. Significant information could then be used to discover how to increase profits, reduce costs or even suggest what your next movie rental on Netflix should be. The ability to pre-process this raw data before making it available to the masses becomes increasingly vital to organizations that must collect, merge and create a centralized repository containing "one version of the truth". Having an ETL solution that is always available, extensible, flexible and highly scalable is an integral part of processing this data.
Read more here at the Pentaho Evaluation Sandbox:
Senior Director of Sales Engineering
Saturday, January 22, 2011
Read more about it here and watch the tutorial and download the sample:
Wednesday, January 12, 2011
Read more here: http://sandbox.pentaho.com/2011/01/pentaho-reporting-and-pentaho-analysis/
Thursday, January 6, 2011
"Crazy" as in crazy busy. I'm sure you have heard the phrase before, but then again it depends on what industry you are in. A down economy has certainly not affected the Commercial Open Source space, I can tell you that.
To add to all the excitement, our Global Partner Summit takes place on January 19th and 20th in San Francisco at the Presidio Golden Gate Club.
CTOs, architects, product managers, business executives and partner-facing staff from System Integrators and Resellers should attend this event. You can register and find out more here: Global Partner Summit
There will be technology tracks, business tracks, Q&A discussion panels and more for all to take part in. This year I am honored to join the team to present a couple of topics that surely should not be missed.
Sales Engineering will be holding sessions that will show you how you can brand and customize the default Pentaho User Console. I will also present how adding "Guided Ad hoc" to your applications can provide business value to those who are not so accepting of the out-of-the-box tools.
You can view the full agenda here
I look forward to speaking with many of you as well as, once again, visiting my home away from home...San Francisco.
See you there.
Director of Sales Engineering
Wednesday, January 5, 2011
When I was with a proprietary BI vendor (before the explosive disruptive model of Commercial Open Source BI), I spent 3 weeks on site with a prospect conducting a POC (Proof of Concept). It was well received for both its data integration and information delivery functionality. However, even though we had specific data integration capabilities that surpassed the competition, we still lost because the business users liked the competition's "Prettier Dashboards". The first thing out of the IT Director's mouth was... "Well, we went with
More recently, inspired by a colleague of mine, Gabriel Fuchs and his web post Data Visualization – Cool is Not a Key Driver! - I am still overwhelmingly surprised at how much emphasis organizations put on the importance of "having nice looking dashboards" without really knowing what is involved under the covers. Furthermore, they have a tendency to not know what charts or visualizations should "go" with "what" data. (You would be surprised at how many simple line charts are used incorrectly, and at how unclear it is to many when to use or not use a pie chart.) I have heard so many colorful descriptions, from dashboards that are "Nice and Friendly" to those that are "Fancy, Sexy, Sizzle and In your face", that I had to wonder if they really understood the business value behind a BI solution at all. At times I was wondering if they were describing their ideal mate or the latest and greatest automobile.
All too often, IT or the occasional business user will start researching BI solutions and stumble upon a software package that appears to do what they need. Perhaps they were able to get a "Fancy" dashboard up and running quickly. Soon they may find that the proposed solution is either too costly, not scalable, only runs on Windows, cannot access all their data easily or perhaps only provides dashboards and lacks other critical BI functionality. They may have been initially captivated by the Siren's music but soon realize that the "Fancy" dashboard was just skin deep. 1 out of every 10 calls that I am on reveals that the prospects are looking for dashboards only. When further discovery takes place, it is also learned that the "dashboard only" deployment is usually for just a few users and localized departmental data, not exactly an Enterprise wide solution. The rest of the prospective calls are looking for Dashboards as well as Reporting, Analysis and more often than not, Data Integration. I mention Data Integration as well because these organizations have disparate data sources on many different platforms. They are looking to easily access, optimize and visualize this data so that it can answer today's questions as well as tomorrow's questions, perhaps across the entire data set - not just a small slice.
Here are some important facts to remember:
- Most business users do not understand the value of BI
- It is important to show how BI can help knowledge workers do a better job
- IT cannot just throw a BI application at the wall of business users and hope it sticks
- BI is NOT a technology tool
- BI involves specific business processes
- BI applications can both drive revenue growth and can also reduce costs to optimize profits
- Do NOT assume that Subject Matter experts understand BI and its potential
- What is your definition of a successful evaluation?
- What data is needed in order to…..?
- Where is the data I need in order to….?
- How easily can I access all that data?
- Do I have the proper skill sets to deploy a BI Application?
- Do I want my business users to ask Ad hoc questions?
- What questions do I or my business users want to ask of the data?
- Do I need Operational reporting including schedule and distribution?
- What have I found from my existing BI application(s)?
- What actions do I want to take from my findings?