NSDL security tasks for fraud detection

NSDL has asked us to implement (i) extended logging of Web access details, (ii) geo-location, and (iii) user-tracking cookies. Prashant Keshvani is currently working on this. Prashant Shendre will join him in a week's time.

Meeting with Shuvam on 21 May 2009

Review of the tasks done in the last week. The task list for
the next few days is as follows. The next status update will be
exchanged on Monday, 25 May 2009.

  • Put the page for the previous meeting on the Intranet
  • Put the page for today's meeting on the Intranet
  • Review Prashant's work on the modified version of
    mod_dumpio for the following:
    • usertrack
    • unique_id
    • IP address of the browser, if feasible
  • Modify Tomcat code to log the combo tuple, and build an
    in-memory hash so that the information is logged only once.
  • The value stored under each hash key shall be the last-seen
    time stamp for that combo key. This time stamp can be used
    later for garbage collection.
  • Look for a Perl module to parse HTTP headers, etc.
  • Study the code of dumpio to do the following:
    • log everything in a single file, preferably with the file
      handle opened once in the lifetime of the Apache process
    • log unique id
    • log request headers up to a certain size
    • log request body up to a certain size and for certain MIME
      types only (e.g. file attachment uploads shall not be
      logged at all)
    • log response headers (only for certain MIME types) up to a
      certain size
    • do not log response body at all
  • Does DumpIO encode binary in headers and other places?
  • Check all experiments performed under mod_ssl
  • Ability to handle 8 bit data for headers and body
  • White-box code review for 8-bit cleanness of log modules
  • Check Apache architecture / post to forums to find out whether
    Apache sanitizes the input data before processing the request,
    logging it, etc.
  • What will happen if Unicode data is received?
  • Code review of dumpio to verify that it does not
    modify data in any way, even in corner cases.
  • Ensure that, while dumping the data, none of the bytes
    stored in the file is binary. If binary data is present, make
    a copy of the string, and modify the copy to replace the binary
    data with "X" before logging. In such a case, also add an extra
    header X-DUMPIO-BINARY to indicate the same.
  • For the HTTP response, the response code, size of response
    (even if it is dynamic content), MIME type and full HTTP
    response header have to be logged.
  • For GeoIP, study all databases mentioned in the proposal,
    and do a comparative study, including features, cost, license,
    etc.
  • Take Merce V3 UI, and add new mock screens for this
    product. Before that, prepare those screens on paper and
    discuss them with Shuvam.
  • Session ID logging in Tomcat: Find out the exceptional
    conditions and the action to be taken for each -- e.g.
    session id is null. In case the session id does not exist, log
    it explicitly irrespective of what is kept in the hash array.
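
The Tomcat items above (log each combo only once, keep the last-seen
time stamp as the hash value, garbage-collect later, and always log a
missing session id) can be sketched roughly as below. This is only an
illustration in Python, not the Tomcat change itself. The class name,
the idle limit, the "NO-SESSION" marker and the fields that make up the
combo key are all assumptions, since these notes do not fix them.

```python
import time


class SeenOnceLog:
    """Log each combo key once and remember when it was last seen."""

    def __init__(self, max_idle_seconds=3600.0, clock=time.time):
        self.max_idle_seconds = max_idle_seconds  # hypothetical idle limit
        self.clock = clock                        # injectable for testing
        self.last_seen = {}   # combo key -> last-seen time stamp
        self.lines = []       # stand-in for the real log file

    def record(self, session_id, *other_fields):
        """Return True if a log line was written for this sighting."""
        now = self.clock()
        if session_id is None:
            # Exceptional case: no session id, so log explicitly
            # irrespective of what is kept in the hash.
            self.lines.append((now, "NO-SESSION", other_fields))
            return True
        key = (session_id,) + other_fields
        first_time = key not in self.last_seen
        self.last_seen[key] = now  # refresh the time stamp on every sighting
        if first_time:
            self.lines.append((now, session_id, other_fields))
        return first_time

    def collect_garbage(self):
        """Drop keys not seen within max_idle_seconds; return how many."""
        cutoff = self.clock() - self.max_idle_seconds
        stale = [k for k, seen in self.last_seen.items() if seen < cutoff]
        for k in stale:
            del self.last_seen[k]
        return len(stale)
```

A key that has been garbage-collected will be logged again the next
time it is seen, which seems acceptable for this purpose.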

Meeting with Shuvam on 12 May 2009

This document requires re-formatting. I will figure out
how to use UL and LI tags within Drupal and correct it.
-- kishan

The following tasks have to be performed:

  • TC: Explore logging of all IN/OUT parameters using Interceptor
  • Get CRA and SPEED-e details like TIN
  • Enable Apache mod_unique_id, and check if the unique id can
    be sent to downstream components
  • Check: NTP must be running on all servers with at least
    5 stratum and 2 ref
  • Check: Domain name setting in Apache VHost
  • Check: gethostname() must work for each server, and then
    hostname to IP address mapping must work.
  • Check: Does the IP address of the browser reach downstream
    components?
  • Check: DB table size limits in DB2 Express Edition and PostgreSQL
  • Apache Log format specification?
  • Apache mod_usertrack: Enable and test it.
  • Apache: Look for sample code to log all HTTP traffic (IN/OUT)
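
The gethostname() and hostname-to-IP checks above could be scripted
and run on each server. A minimal sketch in Python follows. The
function name and the shape of the result are assumptions made for
illustration; only the standard socket calls are real.

```python
import socket


def check_host_resolution(hostname=None):
    """Check that the host name is set and maps to at least one IP."""
    name = hostname or socket.gethostname()
    result = {"hostname": name, "addresses": [], "ok": False, "error": None}
    if not name:
        result["error"] = "gethostname() returned an empty name"
        return result
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror as exc:
        # Name could not be mapped to an address.
        result["error"] = str(exc)
        return result
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    result["addresses"] = sorted({info[4][0] for info in infos})
    result["ok"] = bool(result["addresses"])
    return result


if __name__ == "__main__":
    print(check_host_resolution())
```

Called without an argument it checks the local server's own name.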

Homework for preparing presentation?

  • List of attributes we will log?
    • browser type
    • session ID
    • IP address
    • geographic information (country, city, etc.)
    • user name
    • URL
    • form field names and values in the request
    • ... add more attributes
  • List of derived attributes
    • Less frequent countries
    • Insecure countries
    • Jumping sessions
    • Anonymizer/Open Proxy
    • ... add more attributes
  • Schema -- first draft
  • List of POC to be done
  • Drill down report
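
As an illustration of a derived attribute, the sketch below flags
"jumping sessions". These notes do not define the term, so the sketch
assumes it means one session ID seen from more than one IP address.
The record field names (session_id, ip_address) are also assumptions,
not the schema.

```python
from collections import defaultdict


def find_jumping_sessions(records):
    """Return {session_id: [ips]} for sessions seen from several IPs."""
    ips_by_session = defaultdict(set)
    for rec in records:
        sid = rec.get("session_id")
        if sid is None:
            continue  # requests without a session cannot jump
        ips_by_session[sid].add(rec["ip_address"])
    return {
        sid: sorted(ips)
        for sid, ips in ips_by_session.items()
        if len(ips) > 1
    }


sample = [
    {"session_id": "A1", "ip_address": "192.0.2.10"},
    {"session_id": "A1", "ip_address": "198.51.100.7"},
    {"session_id": "B2", "ip_address": "192.0.2.10"},
    {"session_id": None, "ip_address": "203.0.113.5"},
]
jumping = find_jumping_sessions(sample)
```

Here only session "A1" is reported, because it appears from two
different addresses.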

Presentation to NSDL

  • What will we log?
  • (D1) How will the interactive UI look?
  • What security-sensitive things will we track?
    • Jumping session
    • Getting into URLs mid-session
    • ...
  • (D4) How is our design extensible?
  • (D3) Where will our instrumentation be inserted?
  • What pilots do we intend to do?
  • Project milestones
  • (D5) Hardware infrastructure
  • (D2) Schema: unaudited

What to inform Yatin?

  • One-to-one meeting after 25 May
  • Presentation after taking feedback from the one-to-one
    meeting

Drill-down report

A diagram has to be inserted here. Broadly, it covers the
following, and drill-down can happen from any link to
any other link for a given duration.

  • Country
  • Browsers
  • Jumping browsers
  • Jumping user
  • Cities
  • Anonymizer
  • Desktop
  • Session
  • User
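
The drill-down idea (pick a duration, fix some dimensions, then break
the matching requests down by another dimension) might look like the
sketch below. The record layout, the numeric timestamp field and the
function name are assumptions for illustration only; the real report
would run against the schema once it is drafted.

```python
from collections import Counter


def drill_down(records, group_by, start, end, **filters):
    """Count records in [start, end) matching filters, per group_by value."""
    counts = Counter()
    for rec in records:
        if not (start <= rec["timestamp"] < end):
            continue  # outside the chosen duration
        if any(rec.get(field) != value for field, value in filters.items()):
            continue  # does not match the dimensions already drilled into
        counts[rec.get(group_by)] += 1
    return counts.most_common()


records = [
    {"timestamp": 10, "country": "IN", "browser": "Firefox", "user": "a"},
    {"timestamp": 20, "country": "IN", "browser": "MSIE", "user": "b"},
    {"timestamp": 30, "country": "IN", "browser": "Firefox", "user": "c"},
    {"timestamp": 40, "country": "US", "browser": "Firefox", "user": "d"},
    {"timestamp": 99, "country": "IN", "browser": "Firefox", "user": "e"},
]

by_country = drill_down(records, "country", 0, 50)
browsers_in_india = drill_down(records, "browser", 0, 50, country="IN")
```

The second call drills from the country view into browsers for one
country over the same duration.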