1 Reply Latest reply on Oct 25, 2013 12:26 PM by Lawrence Garvin

    Having trouble gathering total Bytes for specific Hosts

    mateikav

      After running NTA for awhile and finding that Netflix traffic hogs a lot of bandwidth, I wanted to find out just how much. In order to isolate the Netflix related rows in SQL I queried this:

       

      SELECT  TOP 10000 FlowCorrelation_Source_FlowCorrelation.FullHostname AS Full_Hostname_A,

      SUM(NetflowConversationSummary.TotalBytes) AS SUM_of_Bytes_Transferred,

      SUM(NetflowConversationSummary.TotalBytes) AS Total_Bytes

       

       

      FROM

      ((NetflowConversationSummary LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Source_FlowCorrelation ON (NetflowConversationSummary.SourceIPSort = FlowCorrelation_Source_FlowCorrelation.IPAddressSort))  LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Dest_FlowCorrelation ON (NetflowConversationSummary.DestIPSort = FlowCorrelation_Dest_FlowCorrelation.IPAddressSort))  INNER JOIN Nodes ON (NetflowConversationSummary.NodeID = Nodes.NodeID)

       

       

       

       

      WHERE

      ( DateTime BETWEEN 41539 AND 41570 )

      AND

      (

        (FlowCorrelation_Source_FlowCorrelation.FullHostname LIKE '%ix.nflxvideo.net%')

      )

       

       

      GROUP BY FlowCorrelation_Source_FlowCorrelation.FullHostname, FlowCorrelation_Dest_FlowCorrelation.FullHostname, Nodes.Caption, Nodes.NodeID, FlowCorrelation_Source_FlowCorrelation.IPAddress

       

      You'll see in the WHERE that I isolate Netflix by %ix.nflxvideo.net%. This is because each entry into the table has a slightly different ip address. No non-netflix related row would contain ix.nflxvideo.net.

       

      So now that I have every instance of Netflix (assumedly), I want to tally up the TotalBytes column (which now only contains netflix related rows) to get a total amount of usage. I'm not sure how to do this. Theoretically I should be able to do a SUM, but I only get the big list that is in the below screenshot:

       

      I want to tally up Sum_of_Bytes_Transferred. The Total_Bytes column is just my started attempt to accomplish the tally. So far I am stuck.

       

      Thanks in advance. I'm sorry if this is the wrong forum.

        • Re: Having trouble gathering total Bytes for specific Hosts
          Lawrence Garvin

          The query, as written, will return a TotalBytes SUM calculation for each unique combination of the five fields defined in the GROUP BY clause:

          Source_Hostname

          Dest_Hostname

          Nodes.Caption

          Nodes.NodeID

          Source_IPAddress

          but many of these fields are redundant, and unless you want/need subtotals by pathway, the GROUP BY clause is complicating things.

           

          If you actually only want one value to be returned, try this:

           

          SELECT SUM(NetflowConversationSummary.TotalBytes) AS Total_Bytes

          FROM NetflowConversationSummary

          LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Source_FlowCorrelation ON (NetflowConversationSummary.SourceIPSort = FlowCorrelation_Source_FlowCorrelation.IPAddressSort)

          LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Dest_FlowCorrelation ON (NetflowConversationSummary.DestIPSort = FlowCorrelation_Dest_FlowCorrelation.IPAddressSort)

          INNER JOIN Nodes ON (NetflowConversationSummary.NodeID = Nodes.NodeID)

          WHERE ( DateTime BETWEEN 41539 AND 41570 ) AND  (FlowCorrelation_Source_FlowCorrelation.FullHostname LIKE '%ix.nflxvideo.net%')

           

          Taking this a step further, if you wanted Total_Bytes by source hostname (e.g. who's using the most Netflix streaming), which I think was the actual objective, then this should meet that need:

           

          SELECT FlowCorrelation_Dest_FlowCorrelation.FullHostName AS Viewer_Hostname, SUM(NetflowConversationSummary.TotalBytes) AS Total_Bytes

          FROM NetflowConversationSummary

          LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Source_FlowCorrelation ON (NetflowConversationSummary.SourceIPSort = FlowCorrelation_Source_FlowCorrelation.IPAddressSort)

          LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Dest_FlowCorrelation ON (NetflowConversationSummary.DestIPSort = FlowCorrelation_Dest_FlowCorrelation.IPAddressSort)

          INNER JOIN Nodes ON (NetflowConversationSummary.NodeID = Nodes.NodeID)

          WHERE ( DateTime BETWEEN 41539 AND 41570 ) AND  (FlowCorrelation_Source_FlowCorrelation.FullHostname LIKE '%ix.nflxvideo.net%')

          GROUP BY FlowCorrelation_Dest_FlowCorrelation.FullHostName

           

          and you should get one row per source hostname for those hosts who have Netflix traffic recorded.

           

          NOTE: I did not check/validate the joins in the query logic, and if you don't need those fields in the SELECT or GROUP BY clauses, it may also be that you don't need the JOINs, either (e.g the INNER JOIN Nodes), which would only serve to return fields from the Nodes table, which are redundant from the source identifiers in the FlowCorrelation table.