-
Notifications
You must be signed in to change notification settings - Fork 205
refactor!: record sessions as log events #1000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| private var configuration: SessionConfig | ||
| private var session: Session? | ||
| private var lock = NSLock() | ||
| private var _session: Session? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your issue here wasn't with the type of lock you're using, but with reentrency. In your previous code you would lock, then do a bunch of work, and send out notifications then unlock. This means you hold the lock while calling out to external code. If that code calls back into the code that locks, then you have a deadlock.
You've fixed it here by only locking session access, basically making the session get/set atomic which is really all you need since it's immutable.
I'd suggest changing back to locks (event an unfair lock if possible) but keeping the locking on the session only, or in areas that need to be atomically in sync, but do not call out to external code while you hold the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I just used queue for convenience but lock is definitely superior when done carefully. I put out a revision, thanks alex
beekhc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at this from a semantic conventions perspective, and I think it's a good change. Made a couple of small comments.
| /// https://opentelemetry.io/docs/specs/semconv/general/session/ | ||
| logger.logRecordBuilder() | ||
| .setBody(AttributeValue.string(SessionConstants.sessionStartEvent)) | ||
| .setEventName(SessionConstants.sessionStartEvent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that this change makes the Swift behavior more consistent with the Android behavior.
| /// Event name for session end events | ||
| public static let sessionEndEvent = "session.end" | ||
| /// Attribute name for session identifier | ||
| public static let id = "session.id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use the Semantic Conventions constants for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in a8e9374
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i couldn't find the event names though, maybe we are missing some structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be accessible here: https://github.com/open-telemetry/opentelemetry-swift-core/blob/main/Sources/OpenTelemetryApi/Common/SemanticAttributes/Attributes/Session_attributes.swift#L23
misunderstood the discussion; we may need to re-generate the semantic attributes with addition flags to include development items. I can follow up on this.
Sources/Instrumentation/Sessions/SessionEventInstrumentation.swift
Outdated
Show resolved
Hide resolved
| refreshSession() | ||
| let transition : SessionTransition? = lock.withLock { | ||
| if session == nil || session!.isExpired() { | ||
| return startSession() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calling out here can be really dangerous since you hold the lock. Sometimes you can do it when you have a naming convention like locked_startSession so everyone knows that you're already lock in there. The biggest fear is that startSession calls out in some fashion. If this isn't the case then ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same with refreshSession below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored so the lock is only used in bits of getSession, peekSession, and restoreSessionFromDisk. Do you still recommend locked_ prefix for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i understand now, added the prefix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this file
a8e9374 to
325829c
Compare
Breaking
I refactoring session events to be recorded as log event, as opposed to regular log records. In addition, I am removing
start_timeandend_timefrom attributes, since they are not needed (discussed in open-telemetry/semantic-conventions#2770).Summary
Since they are session "events", I have refactored session to be log events. https://opentelemetry.io/docs/specs/semconv/general/session/
I have also improved the thread safety in session instrumentation by migrating from locks to queues, which I found to be much less susceptible to lock contention. This will also help with the migration in #988
Logs
Session Start
{ "timeUnixNano": "1764721305052582144", "attributes": [ { "key": "session.id", "value": { "stringValue": "2840B2AE-E4ED-41E0-B09E-3300DD715F46" } }, { "key": "session.previous_id", "value": { "stringValue": "C2A8C842-36B3-4CBB-90FE-5FDC0E6A4CF6" } } ], "observedTimeUnixNano": "1764721305052721920", "eventName": "session.start" }Session End
{ "timeUnixNano": "1764721293493801984", "attributes": [ { "key": "session.id", "value": { "stringValue": "C2A8C842-36B3-4CBB-90FE-5FDC0E6A4CF6" } }, { "key": "session.previous_id", "value": { "stringValue": "3D787EDB-E3F1-4E8E-8216-912B5DDEAE9D" } }, { "key": "session.duration", "value": { "doubleValue": 127912 } } ], "observedTimeUnixNano": "1764721305052654080", "eventName": "session.end" }Testing
Refactored the aws-otel-swift and AwsHackerNewsApp to use otel session instrumentation here - aws-observability/aws-otel-swift@main...williazz:aws-otel-swift:dev/session-events-refactor
Visualization in local AwsOtelUI framework