Capturing component throughput


[microsoft follow-up]

 

i've been reading this thread guy asking capturing throughput of dataflow. suggested there no real notion of capturing throughput of dataflow believe there notion of capturing throughput of component or execution tree.

 

i believe information 1 need capture throughput of component (apart name of component is) available in onpipelinerowssent event. if there onpipelinerowssent eventhandler , onpipelinerowssent event contained name of component think able capture throughput of component. so, questions:

  1. why there no eventhandler onpipelinerowssent event?
  2. can name of component added information in onpipelinerowssent event?

 

following on this... once had conversation here kirk haselden capturing pipeline throughput. thought idea , suggested raised dcr did in old pre-connect days , seems though dcr (like many other things) didn't make across connect. so, more questions:

  1. can find connect dcrs relating capturing throughput? i've found this: https://connect.microsoft.com/sqlserver/feedback/viewfeedback.aspx?feedbackid=152162 that raised 18 months ago hasn't had single comment @ microsoft.
  2. do think capturing throughput useful? can foresee huge advantages capturing in debugger. (note informatica , has done years. has nice gui shows throughput of each destination in mapplet.)

i'd welcome thoughts around this. big ask , fits in nicely constant, nay incessent, requests debugging enhancements maybe 1 darvey have read of???

 

thanks

jamie

 

hiya,

 

there’s been bit of activity on topic, , on dcr in particular, unfortunately it’s not visible through connect.

 

we have added debugging improvements should appear in next ctp, , deal calculating row throughput. it’s not ideal (i.e. there’s more work done), seems starting point determining bottlenecks.

 

note, stuff hasn’t been released yet , change ctp / final release.

 

there new pipelinecomponenttime log event shows how time spent on each component after execution complete.

 

example:

 

the component "no cache lookup" (291) spent 57870 milliseconds in processinput.

the component "full cache lookup" (316) spent 125 milliseconds in processinput.

 

we output amount of time source has spent suspended give better idea of actual processing time.

 

example:

 

during last execution pipeline suspended output "ole db source output" (11) of component "ole db source" (1) 57268 milliseconds limit number of in-memory buffers.

the component "ole db source" (1) spent 62938 milliseconds in primeoutput filling buffers on output "ole db source output" (11).

 

to answer other questions –

 

event handlers control flow / runtime thing – i’m not sure if our architecture allows pipeline components define own event handlers. open separate dcr, however. first impression event handler in pipeline, event fired onpipelinerowssent, skew performance performance results. you’d practically running new control flow on every event. onpipelinerowssent current logging event, best way capture custom log provider.

 

for component name – isn’t in output? when log event see:

 

“rows provided data flow component input. :  : 310 : ole db source output : 291 : lookup : 292 : lookup input : 230”

 

the format after message is:

 

: : <src id> : <src name> : <dst id> : <dst name> :  <dst input id> : <input name> : <# rows sent>

 

do think these logging messages useful?

 

on side note, know should blogging stuff more! i’ll have time turn blog post, , include details of other supportability improvements.

 

~matt

 



SQL Server  >  SQL Server Integration Services



Comments

Popular posts from this blog

Conditional formatting a graph vertical axis in SSRS 2012 charts

Register with Power BI failed

SQL server replication error Cannot find the dbo or user defined function........