topics/ADR/gn3/000-remove-stace-traces-in-gn3-error-response.gmi



# [ADR-001/gn3] Remove Stack Traces in GN3

* author: bonfacem
* status: rejected
* reviewed-by: jnduli, zach, pjotr, fredm

## Context

GN3 error responses currently embed the stack trace in the JSON body:

```
import traceback

from flask import current_app, jsonify


def add_trace(exc: Exception, jsonmsg: dict) -> dict:
    """Add the traceback to the error handling object."""
    return {
        **jsonmsg,
        # Single-argument format_exception() requires Python 3.10+.
        "error-trace": "".join(traceback.format_exception(exc))
    }


def page_not_found(pnf):
    """Generic 404 handler."""
    current_app.logger.error("Handling 404 errors", exc_info=True)
    return jsonify(add_trace(pnf, {
        "error": pnf.name,
        "error_description": pnf.description
    })), 404


def internal_server_error(pnf):
    """Generic 500 handler."""
    current_app.logger.error("Handling internal server errors", exc_info=True)
    return jsonify(add_trace(pnf, {
        "error": pnf.name,
        "error_description": pnf.description
    })), 500
```


## Decision

Stack traces can help malicious actors compromise our system by giving them extra context about its internals.  As such, an error response should contain only a useful description of what went wrong and an appropriate error status code; the stack trace should go to our logs.  We can then use the logs to troubleshoot our system.
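
The proposal could look roughly like the sketch below.  It is an illustration under assumptions, not GN3 code: the logger name "gn3.errors" and the helper error_payload are hypothetical, and the helper returns a plain dict so that it runs without Flask (a real handler would wrap the dict in jsonify).

```python
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("gn3.errors")


def error_payload(exc: Exception, status: int) -> tuple[dict, int]:
    """Log the full traceback and return a trace-free error body."""
    # The traceback goes to the logs, never to the client.
    logger.error("Handling %s error", status, exc_info=exc)
    return {
        # werkzeug HTTP exceptions carry .name and .description;
        # fall back to the class name and message for anything else.
        "error": getattr(exc, "name", type(exc).__name__),
        "error_description": getattr(exc, "description", str(exc)),
    }, status
```

The response keeps the "error" and "error_description" keys from the current handlers and drops only "error-trace".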

## Consequences

* Requires a lockstep update in the GN2 UI to how it handles GN3 errors.

## Notes

ADR rejected.  Currently, having stack traces in the response is a convenient feature when bugs are reported to us by others.  It's not always easy to reproduce the issue in question and check logs (since they wouldn't show up in production and would need to be reproduced locally); therefore having stack traces available in such situations can be very useful.  To get rid of the stack traces, we would also have to link each trace in the logs with the request that caused it, so that during troubleshooting we can correlate an endpoint with an error and its trace.
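
One way to link a logged trace to the request that caused it is to tag both with a shared identifier.  The sketch below is only an assumption about how that could work: the "request-id" key, the logger name and the helper error_payload_with_id are all hypothetical, and nothing like this exists in GN3 as described here.

```python
import logging
import uuid

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("gn3.errors")


def error_payload_with_id(exc: Exception, status: int) -> tuple[dict, int]:
    """Log the traceback under a fresh ID and return that ID to the client."""
    request_id = uuid.uuid4().hex  # 32 hexadecimal characters
    # The same ID appears in the log line that carries the traceback.
    logger.error("request-id=%s status=%s", request_id, status, exc_info=exc)
    return {
        "error": getattr(exc, "name", type(exc).__name__),
        "error_description": getattr(exc, "description", str(exc)),
        # A bug reporter can quote this value; we then search the logs for it.
        "request-id": request_id,
    }, status
```

With such an identifier, a bug report only needs to include the "request-id" value for us to find the matching trace in the logs.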