RIF To-do List

Java Middleware

  1. DataStorageLayer is split into ms and pg. All web services have either /ms or /pg in the URL to signify which database is being used. This needs to be refactored into a common superclass for both databases, dispensing with the two separate lumps of code for each database type. The actual differences between ms and pg are not very big. The RIF works the way things are now, but there is a lot of code duplication, which will cause a huge maintenance problem in the future.

    Done MM.

  2. The R script uses ODBC. Now that JRI is used, this can be changed to JDBC to allow a Linux version of the RIF. The JRI code has now been isolated into RIF_odbc.R prior to conversion.

    Done MM. Uses JDBC on Postgres, ODBC on SQL Server

  3. Improved logging. [PH done partially September 2017]; including correct error recovery tracing. Note there are issues with log4j log rotation.

    Some improvements to log rotation by using one log file per service. Added front end logger at the same time. Logging no longer “hangs” at the end of the day (the log was only written if you shut down Tomcat nicely); but log rotation and the delay to write are still problems. Any solution

  4. Still A LOT of redundant, dead-end, stubbed, duplicate or unused code resulting from the lack of initial scoping as to what the RIF was going to do.

  5. Risk Analysis: Done, but more work needed on maps (do not include selection shapes)

  6. Data Extract ZIP file. PH completed initial middleware support.

  7. Rengine not being shut down correctly on reload of service:

    Cannot find JRI native library!
    Please make sure that the JRI native library is in a directory listed in java.library.path.
    
    java.lang.UnsatisfiedLinkError: Native Library C:\Program Files\R\R-3.4.0\library\rJava\jri\x64\jri.dll already loaded in another classloader
        at java.lang.ClassLoader.loadLibrary0(Unknown Source)
        at java.lang.ClassLoader.loadLibrary(Unknown Source)
        at java.lang.Runtime.loadLibrary0(Unknown Source)
        at java.lang.System.loadLibrary(Unknown Source)
        at org.rosuda.JRI.Rengine.<clinit>(Rengine.java:19)
        at rifServices.dataStorageLayer.pg.PGSQLSmoothResultsSubmissionStep.performStep(PGSQLSmoothResultsSubmissionStep.java:183)
        at rifServices.dataStorageLayer.pg.PGSQLRunStudyThread.smoothResults(PGSQLRunStudyThread.java:257)
        at rifServices.dataStorageLayer.pg.PGSQLRunStudyThread.run(PGSQLRunStudyThread.java:176)
        at java.lang.Thread.run(Unknown Source)
        at rifServices.dataStorageLayer.pg.PGSQLAbstractRIFStudySubmissionService.submitStudy(PGSQLAbstractRIFStudySubmissionService
    

    The workaround is to restart Tomcat. A server reload needs to stop R; this requires a @WebListener, i.e. a context listener implementing javax.servlet.ServletContextListener.
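The shutdown step described in item 7 could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the RIF's implementation: `StoppableEngine` and `REngineHolder` are hypothetical names, and the servlet API and JRI are left out so that the code compiles on its own. With JRI, the `stop()` call would presumably delegate to `Rengine.end()`. Stopping the engine alone may not release the native library, because the JVM ties a loaded native library to its class loader; the sketch covers only the shutdown step.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the JRI engine, so the sketch compiles without JRI.
interface StoppableEngine {
    void stop();
}

// Hypothetical holder for the single R engine used by the web application.
final class REngineHolder {
    private static final AtomicReference<StoppableEngine> ENGINE = new AtomicReference<>();

    private REngineHolder() {
    }

    // Called once, at the point where the engine is first created.
    static void register(StoppableEngine engine) {
        ENGINE.set(engine);
    }

    // Stops the engine at most once; returns false if there was nothing to stop.
    static boolean shutdown() {
        StoppableEngine engine = ENGINE.getAndSet(null);
        if (engine == null) {
            return false;
        }
        engine.stop();
        return true;
    }
}

// In the web application, a listener along these lines would call the holder
// (shown as a comment because it needs the servlet API on the classpath):
//
//   @WebListener
//   public class REngineShutdownListener implements ServletContextListener {
//       public void contextInitialized(ServletContextEvent event) { }
//       public void contextDestroyed(ServletContextEvent event) {
//           REngineHolder.shutdown();
//       }
//   }
```

Making `shutdown()` idempotent means it is harmless if the context is destroyed before any study has started R.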

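For item 1 above, the general shape of a common superclass with small database-specific subclasses might look like the sketch below. All class and method names are hypothetical and are not taken from the RIF code base; the row-limit syntax is used only as an example of a small ms/pg difference.

```java
// Hypothetical names throughout; this is not the RIF's actual class hierarchy.
abstract class AbstractSqlStudyStep {

    // Shared logic is written once, in the superclass.
    final String buildExtractQuery(String tableName, int rowLimit) {
        return "SELECT " + topClause(rowLimit) + "* FROM "
                + quoteIdentifier(tableName) + limitClause(rowLimit);
    }

    // Only the dialect differences are left to the subclasses.
    protected abstract String quoteIdentifier(String name);

    protected abstract String topClause(int rowLimit);

    protected abstract String limitClause(int rowLimit);
}

final class PostgresStudyStep extends AbstractSqlStudyStep {
    @Override
    protected String quoteIdentifier(String name) {
        return "\"" + name + "\"";
    }

    @Override
    protected String topClause(int rowLimit) {
        return ""; // Postgres limits rows at the end of the query
    }

    @Override
    protected String limitClause(int rowLimit) {
        return " LIMIT " + rowLimit;
    }
}

final class SqlServerStudyStep extends AbstractSqlStudyStep {
    @Override
    protected String quoteIdentifier(String name) {
        return "[" + name + "]";
    }

    @Override
    protected String topClause(int rowLimit) {
        return "TOP (" + rowLimit + ") "; // T-SQL limits rows straight after SELECT
    }

    @Override
    protected String limitClause(int rowLimit) {
        return "";
    }
}
```

With this layout, `new PostgresStudyStep().buildExtractQuery("example_table", 10)` and the `SqlServerStudyStep` equivalent share one code path and differ only in the three small hooks.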
JavaScript

  1. Code mostly works - may need some tidying in places. Possibly refactor the submission mapping tools (rifd-dsub-maptable) to fit in with the Leaflet code used in disease mapping and the data viewer, as there is a lot of duplication. It works fine as it is though; this is just a maintenance issue. Especially: rifp-dsub-maptable.html, rifs-util-mapping.js

  2. Export map to png functionality. [Done PH using Java]

  3. Save rifSubmission to text file (.JSON). This is currently done with a directive in JS and as such is a bit temperamental because of various browser/security issues. We need a new middleware method to save the rifSubmission.txt as a .json file [see above]. [Done PH - implemented using middleware]. Files are now in JSON5 format to make them more readable

  4. Some of the references to parent and child scopes are messy and non-angular and may need looking at. But it does work. Generally best replaced by services (e.g. the AlertService to access the AlertController)

  5. Main CSS needs removal of redundant code

  6. At some point, Leaflet version used will need to be updated to v1.2.x. Breaking changes to the RIF expected.

  7. New D3 output graphs as-and-when requested by users. It is likely that when risk analysis is done, new graphs and/or tables will be needed.

  8. A download link is required to download the actual ZIP file [this requires a Middleware method too!] [Done PH. Is intelligent]. May need a [redo] download button

  9. When you change the geography the numerator and denominator do not change and need to be changed manually; [Done; bug fix]

  10. Login initialisation errors if a) you shoot tomcat whilst logged on [the RIF must be reloaded] and b) spurious complaints caused by the process of logging out; e.g.
    ERROR: API method "isLoggedIn" has a null "userID" parameter.
    ERROR: Record "User" field "User ID" cannot be empty.
    

    Normally reloading the RIF allows the user to logon again; although if the user is already logged on the Middleware will not let the user log on a second time; [Done: 29/3/2018 PH]

  11. The newest study completed when the RIF was initialised is displayed; this does not change even when the user goes to the tab for the first time;

  12. Add save study/comparison bands to file. Upload from file must have fields named ID,Band and can have other fields (e.g. NAME). Names are restricted and a save to file option would be good. File: rifd-dsub-maptable.js; [Done PH 3/9/2018]

  13. Map synchronisation issues: [First set done: PH 18/12/2017]

    Second set:

  14. The map hover displays the area_id property and should also display the name property if it is available [Done: 7/11/2017]. See also issue #65;

  15. Null zoomlevel error, appears when moving between the data viewer and the disease mapper. Made much more likely by changing from one geography to another! [Partially done: PH 18/12/2017]
    11:58:59.708 XML Parsing Error: no element found
    Location: https://localhost:8080/rifServices/studyResultRetrieval/ms/getTileMakerTiles?userID=peter&geographyName=USA_2014&geoLevelSelectName=CB_2014_US_COUNTY_500K&zoomlevel=null&x=1&y=0
    Line Number 1, Column 1: 1 getTileMakerTiles:1:1
    

    Appears to stop the “zoom” to map and then to study extent;

  16. Memory leaks [Done: PH 18/12/2017]

BEWARE: THIS ISSUE COULD RETURN: ALWAYS TEST CHANGES TO ANY MAP CODE FOR LEAKS
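The null zoomlevel error in item 15 shows the literal string `null` reaching getTileMakerTiles as the `zoomlevel` parameter. One possible defence, sketched here as an assumption rather than as the RIF's actual fix, is for the middleware to validate the parameter before using it. `ZoomLevelParam` is a hypothetical helper, and treating any non-negative integer as valid is also an assumption.

```java
import java.util.OptionalInt;

// Hypothetical helper: turns the raw "zoomlevel" query parameter into a
// validated value, or an empty result if the request should be rejected.
final class ZoomLevelParam {

    private ZoomLevelParam() {
    }

    static OptionalInt parse(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        String trimmed = raw.trim();
        // The front end can send the literal text "null" when the zoom is not yet set.
        if (trimmed.isEmpty() || trimmed.equalsIgnoreCase("null")) {
            return OptionalInt.empty();
        }
        try {
            int zoom = Integer.parseInt(trimmed);
            return zoom >= 0 ? OptionalInt.of(zoom) : OptionalInt.empty();
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

A caller could then answer an empty result with a clear client error instead of an empty body, which is what appears to trigger the XML parsing error in the browser.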

Database

Missing information not stored in the database

  1. Retrieve information on a completed study. Used in the info button in disease mapping and data viewer. The database cannot return all the required information. This requires changes to both the backend and middleware. [Done PH]

  2. I’m not sure the statistical method is being stored in the database correctly; it is always NONE. [Done PH]

Postgres Port

  1. The Postgres port is slow. This is because the logging function rif40_log() is not compiled in Postgres and is therefore slow. It needs either to be made acceptably fast or to have the debug messages commented out. This particularly affects triggers. The Postgres SEER data is also extracting 3x slower than SQL Server; this requires further analysis; [Done: 13/11/2017; room for improvement]
  2. Data loading scripts need to be made independent of make, i.e. run from a single script like the SQL Server ones, with one file per object;
  3. Patches need to be merged.

SEER test dataset

  1. A large scale test dataset of real data is required for testing. The US County level SEER data was selected. [Done: both ports]

    The SEER cancer data has 9,176,963 rows and requires 800MB for the data and 1.3GB for the indexes.

    The test study 1004 SEER 2000-13 lung cancer HH income mainland states.json:

    Data is extracted on SQL Server in 35s and processed by R INLA in 40s.

Security Testing [Done: PH 24/11/2017]

205 unique URLs were tested using OWASP ZAP (https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project) Ajax Spider

The report is at /rapidInquiryFacility/development/owasp_zap_test1.md and the list of URLs tested is at /rapidInquiryFacility/development/url_list2.txt.

One medium and three low medium issues were highlighted for fixing.

Medium Issues

  1. X-Frame-Options Header Not Set

    X-Frame-Options header is not included in the HTTP response to protect against ‘ClickJacking’ attacks.

Solution

References

Low Medium Issues

  1. Incomplete or No Cache-control and Pragma HTTP Header Set

    The cache-control and pragma HTTP header have not been set properly or are missing allowing the browser and proxies to cache content.

Solution

Reference

  2. X-Content-Type-Options Header Missing

    The Anti-MIME-Sniffing header X-Content-Type-Options was not set to ‘nosniff’. This allows older versions of Internet Explorer and Chrome to perform MIME-sniffing on the response body, potentially causing the response body to be interpreted and displayed as a content type other than the declared content type. Current (early 2014) and legacy versions of Firefox will use the declared content type (if one is set), rather than performing MIME-sniffing.

Solution

Other information

References

  3. Web Browser XSS Protection Not Enabled

    Web Browser XSS Protection is not enabled, or is disabled by the configuration of the ‘X-XSS-Protection’ HTTP response header on the web server

Solution

Other information

References
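The four findings above all concern missing HTTP response headers, so one way to address them together is a single table of headers that a servlet filter applies to every response. The sketch below is an assumption, not the fix that was applied to the RIF. `SecurityHeaders` is a hypothetical class, and all header values apart from `nosniff` (named in the finding above) are commonly used settings chosen for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical table of response headers covering the four findings.
final class SecurityHeaders {

    private SecurityHeaders() {
    }

    static Map<String, String> defaults() {
        Map<String, String> headers = new LinkedHashMap<>();
        // Medium: X-Frame-Options Header Not Set (anti-clickjacking)
        headers.put("X-Frame-Options", "SAMEORIGIN");
        // Low medium: Incomplete or No Cache-control and Pragma HTTP Header Set
        headers.put("Cache-Control", "no-cache, no-store, must-revalidate");
        headers.put("Pragma", "no-cache");
        // Low medium: X-Content-Type-Options Header Missing
        headers.put("X-Content-Type-Options", "nosniff");
        // Low medium: Web Browser XSS Protection Not Enabled
        headers.put("X-XSS-Protection", "1; mode=block");
        return Collections.unmodifiableMap(headers);
    }
}
```

A filter would loop over `SecurityHeaders.defaults()` and call `HttpServletResponse.setHeader(name, value)` for each entry. Disabling caching for every response may not suit static content such as map tiles, so the cache headers might need to be limited to dynamic URLs.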

TileMaker and TileViewer

TileMaker is currently working with some minor faults but needs to:

  1. Support for geographic centroids [Done];
  2. Run the generated scripts. This requires the ability to log on, and the PSQL copy needs to be replaced with SQL COPY FROM STDIN/TO STDOUT using STDIN/STDOUT file handlers in Node.js;
  3. UTF8/16 support (e.g. Slättåkra-Kvibille should not be mangled as at present);
  4. Support very large shapefiles (e.g. COA2011) [Done];
  5. Needs a manual [Done]!
  6. The GUIs need to be merged and brought up to the same standard as the rest of the RIF. The TileViewer screen is in better shape than the TileMaker screen. Probably the best solution is to use Angular;
  7. Support for database logons;
  8. Needs to calculate geographic centroids using the database.
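For the UTF8/16 item above, one common way a name such as Slättåkra-Kvibille gets mangled is that UTF-8 bytes are decoded with a single-byte character set. Whether that is the cause in TileMaker is not confirmed here, and TileMaker itself runs on Node.js; the sketch below uses Java only to illustrate the failure mode. `EncodingDemo` is a hypothetical name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encodes a name as UTF-8, then decodes the bytes
// with the given charset to show what a reader using that charset would see.
final class EncodingDemo {

    private EncodingDemo() {
    }

    static String decodeUtf8Bytes(String text, Charset readerCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, readerCharset);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of the compiler's encoding.
        String name = "Sl\u00e4tt\u00e5kra-Kvibille";
        String intact = decodeUtf8Bytes(name, StandardCharsets.UTF_8);
        String mangled = decodeUtf8Bytes(name, StandardCharsets.ISO_8859_1);
        System.out.println("Round trip intact: " + intact.equals(name));
        System.out.println("Wrong charset changes text: " + !mangled.equals(name));
    }
}
```

Each accented letter occupies two bytes in UTF-8, so decoding with ISO-8859-1 turns it into two unrelated characters. Reading and writing every file with an explicit UTF-8 setting avoids this class of error.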

Information Governance Tool

Needs to be specified.

Data Loading Tool

Needs to be discussed with CDC.

TODO

Milestones

PH Final Deliverables

These run to the end of the contract on 10th October 2018.

  1. Layman's installation manual and technical manual for setting up a RIF software installation on a new machine by any of the members of the RIF core team (i.e. Martin, Brandon and Fred) – 10th May 2018

    (Instruction to be cross checked on ICL on site computers by SAHSU team before sign off)

  2. Layman's method for adding a new user login, for both non-secured and secure environments (i.e. a private network). Provided as part of the database administrator's manual. To be provided in SQL; requires database administration privileges. Automation to be a part of the future information governance tool – May 2018

  3. UK geographies to be added and protocol for adding new geographies to the RIF to be available. Clear documented methods for adding new geographies. – May-June 2018

  4. Straightforward way of adding data to the RIF database. Supply data formatting instructions to the SAHSU data team lead. – June 2018

  5. Disease mapping and risk analysis: the same functionalities as originally included in the RIF 3.2 (except multiple investigations) should be included and working. Having additional functionalities (e.g. SatSCAN) would be good, but if not possible, clear guidance on how to add functionalities should be available. All the functionalities available in the beta-version should have been tested by the end of the contract period (with the support of the RIF team, including Aina and Fred). While Brandon and Martin will have key roles in the development of the risk analysis functionalities, you will oversee the integration into the RIF of the different pieces developed by each of them. – July 2018

  6. ICD codes. Provide protocol for application notes. Will add support to the RIF for using ICD9, 10 and 11 simultaneously if time permits – Aug 2018

  7. Confounders. Sex and age defaults. Protocol on how to add and test other confounders, such as socio-demographic status (Carstairs or IMD), ethnicity or smoking, to be made available so that the SAHSU team can manage them. – Aug 2018

  8. TileMaker and TileViewer: Details of coding and troubleshooting suggestions provided and possible bug fixes. Both to be tested on differing geographies before handover. – Apr-Sep 2018

  9. A complete manual describing the functionalities of the RIF and their use, to be developed with other members of the RIF team, including Brandon, Martin, Aina and Fred. To be kept current. – Apr-Sep 2018

  10. Fully up to date GitHub repository, with clear annotations and explanations. To be kept current – Apr-Sep 2018

Chart

| Who | April to May 2018 | June to July 2018 | August to September 2018 |
| Peter Hambly | Build SAHSU production system, UK 2011 geography | test SAHSU production system | RIF Handover to Martin McCallion |
| | Manuals: system manager, data loader, revise user | Remaining database related functionality | Handover SAHSU production system to Hima Daby |
| Martin McCallion | Risk analysis | Complete risk analysis | Data loader |
| Brandon Parkes | RIF results field renaming specification | | |
| | Statistical script for processing risk analysis | | |

April to May 2018

Dependencies

Middleware

SAHSU RIF [Peter Hambly];

Database [Peter Hambly];

June to July 2018

SAHSU RIF [Peter Hambly];

Middleware

Front End [Peter Hambly]

High Priority

Low Priority

Database [Peter Hambly]

High Priority

Low Priority

Front End [Peter Hambly]

August to September 2018

RIF Handover [Peter Hambly]

Wish list

This is nice-to-have functionality that is on hold pending an assessment of need and/or technical feasibility.

Middleware Wish list

Front End Wish list

Issues

These are issues that have been noted but do not affect the running of the RIF

Middleware Issues

Front End Issues

Database Issues

Data loader Issues

Documentation Issues

Peter Hambly May 2nd 2018