David's Astronomy Pages
Main aims
Equipment & Software
Highlights
Summary Plots & Logs
Observing Result (2018-08-31, S623)
Observatory Operations started at 21:47, Fully-Automated Mode from 23:43.
1st Job Queue started at 21:57. Shutter closed at 00:22 due to cloud.
2nd Job Queue started at 01:07.
Software interruptions/restarts at 01:45 (10 mins), 02:06 (4 mins), 02:47 (5 mins).
Job Queue ended 03:58. Session finished 04:01.
(Observation Status: Green = Completed, Yellow = Partially Completed, Red = Failed)
Night Sky Summary Plot - 2018-08-31. Top axis: Sky Brightness at Zenith (ADU/s). Left-hand axis: Local Time (hh LT). Right-hand axis: Sun Altitude (degrees).
AstroMain completely locked-up/hung.
The Observatory Control Program (CCDApp2) locked up completely at one point during the
session. The program was restarted and operated normally for the remainder of the
session. The hang occurred while the CCDApp2 window was being moved. The cause is
unknown, but it is assumed to be related to threads trying to update components on
the CCDApp2 window whilst it was busy. Deemed largely fixed (Nov 2018): the problem
seemed to be related to running the program through the Visual Studio debugger, and
running the program standalone reduced the number of freezes considerably.
Resolved further in 2023.
Resolved in due course (AstroMain)
The Problem
Execution of a QSO Job Queue has been observed to freeze occasionally and not
proceed until the program is exited and restarted.
The problem began after the automated initiation of PHD2 autoguiding was
introduced. It has occurred 0 to 2 times per session. Execution of the job queue
froze after 'Checking Coordinates...' and before 'Checking Altitude...'.
The main interface of the program remains active, allowing the program to be
closed rather than killed (unlike the issue above, where the CCDApp2 program has
to be killed). Since this is a showstopper for unattended / automated operation of
the observatory, the problem was investigated in detail.
Problem Analysis
Examination of the output lines in the Report File provides evidence of what code had been executed and what code hadn't been reached before the job queue froze.
Report File (from incident)
Checking Target ... Ok Sidereal Target (GCVS CI Cyg)
Checking Coordinates ... Ok J2000 Coords RA: 19 50 11.798, Dec: +35 41 02.99
(<< Job Queue freezes here )
Normal Report File (from later in the same session):
Checking Target ... Ok Sidereal Target (GCVS CI Cyg) (<< within ImageTargetT_GotoTarget() )
Checking Airmass ... Ok Airmass 1.22
Examination of output lines in the Log File provides further evidence of what code has / hasn't been executed before the job queue froze.
Log File (from incident):
22:26:10.14 | GCVS CI Cyg Try Target 5/21 | Target : GCVS CI Cyg
22:26:10.14 | GCVS CI Cyg Info SetScopeMode | Setting Scope Mode to 2
22:27:18.34 | GCVS CI Cyg Data ObjectAzAlt | Az = 184.6 Alt= 68.4 Airmass=1.08
(<< within LookupKeyTargetInfo() )
(<< Job Queue freezes here )
(<< no line output about PHD2 App State )
(<< no line output about Setting Scope Mode to 3 )
22:33:14.20 | Killing current CCDSoft process -3580
Normal Log Report (from later in the same session):
01:02:38.27 | GCVS CI Cyg Try Target 2/9 | Target : GCVS CI Cyg
01:02:38.52 | GCVS CI Cyg Data ObjectAzAlt | Az = 249.4 Alt= 55.3 Airmass=1.22
01:02:39.03 | Checking PHD2 App State (Result=Stopped) (<< within Phd2.StopPHD2Guiding() )
01:02:39.03 | GCVS CI Cyg Info SetScopeMode | Setting Scope Mode to 3 (<< within SetScopeMode() )
01:02:39.48 | GCVS CI Cyg Data Object Coords (2000) | RA= 19 50 11.798 Dec= +35 41 02.99 | Epoch 2000
01:02:39.48 | GCVS CI Cyg Data Object Coords | RA= 19 50 53.988 Dec= +35 44 11.27 | Epoch Current
Relevant Code:
ReportComment(LAlign2("Ok", 10) + LPad18("J2000 Coords") + "RA: " + GetFullRA(T.Ra2000) + ", " + "Dec: " + GetFullDEC(T.Dec2000))
' (<< Job Queue freeze occurs after here)

' Stop GuideScope Guiding (if we will be slewing)
' ----------------------
If (objConsole.checkboxAutomatedPhd2Guiding.Checked Or Phd2.Active) And T.Delay <> cdNoSlew Then
    Phd2.StopPHD2Guiding()
End If

SetScopeMode(ScopeMode.Slewing)        ' (<< Job Queue freeze must occur before here)
ImageTargetT_GotoTarget(T, bExitSub)   ' (<< Job Queue freeze must occur before here)
From this analysis it would seem that the freeze occurs within the Phd2.StopPHD2Guiding() routine, and more specifically at the line PHD2.SendRequest("{""method"": ""get_app_state"", ""id"": #ID}", response, "result", Result).
PHD2.SendRequest and the subroutines that it calls (PHD2.GetMessage) make calls
to Phd2ServerStream.Write and Phd2ServerStream.Read, where Phd2ServerStream is a
NetworkStream opened by a prior call to Phd2Socket.GetStream within
PHD2.ConnectPHD2. Whilst the stream's Write and Read calls could easily produce
exceptions, it is less clear how they might be responsible for the freeze, unless
perhaps the stream was half-open in some way when the write or read call was made.
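The stream possibility can at least be bounded in code. By default a NetworkStream read blocks until data arrives, so if no reply ever comes the call never returns. The sketch below is a minimal illustration, not the CCDApp2 code: it sets ReadTimeout so that a stalled Read raises an IOException instead. SendWithTimeout and DemoSilentServer are hypothetical names, the 500 ms timeout is arbitrary, and the local listener merely stands in for an unresponsive server.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Net.Sockets
Imports System.Text
Imports Microsoft.VisualBasic

Module ReadTimeoutSketch

    ' Sends one request and waits for a reply, giving up after timeoutMs
    ' instead of blocking indefinitely. Returns Nothing on timeout or close.
    ' (Hypothetical helper, not part of the observatory program.)
    Function SendWithTimeout(stream As NetworkStream, request As String,
                             timeoutMs As Integer) As String
        stream.ReadTimeout = timeoutMs
        stream.WriteTimeout = timeoutMs
        Dim payload As Byte() = Encoding.ASCII.GetBytes(request & vbCrLf)
        stream.Write(payload, 0, payload.Length)
        Dim buf(4095) As Byte
        Try
            Dim n As Integer = stream.Read(buf, 0, buf.Length)
            If n = 0 Then Return Nothing   ' remote end closed the connection
            Return Encoding.ASCII.GetString(buf, 0, n)
        Catch ex As IOException
            Return Nothing                 ' no data arrived within timeoutMs
        End Try
    End Function

    ' Demonstration against a local listener that accepts the connection
    ' but never replies (a stand-in for an unresponsive server).
    Function DemoSilentServer() As Boolean
        Dim listener As New TcpListener(IPAddress.Loopback, 0)
        listener.Start()
        Dim port As Integer = CType(listener.LocalEndpoint, IPEndPoint).Port
        Using client As New TcpClient()
            client.Connect(IPAddress.Loopback, port)
            Using serverSide As TcpClient = listener.AcceptTcpClient()
                Dim reply As String = SendWithTimeout(client.GetStream(),
                    "{""method"": ""get_app_state"", ""id"": 1}", 500)
                listener.Stop()
                Return reply Is Nothing
            End Using
        End Using
    End Function

    Sub Main()
        Console.WriteLine("Timed out as expected: " & DemoSilentServer())
    End Sub
End Module
```

A Nothing result could then be logged and the request retried, or the connection re-opened, rather than leaving the job queue waiting.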
PHD2.GetMessage also contains a Do Until..Loop. This seems a much more
feasible place for the job queue to freeze, if execution couldn't break out of
the loop. The loop is designed to work with 'jsonrpc' messages and extracts
the string lying between two curly brackets (e.g. "{string}"
will return "string"). Testing the loop in isolation showed that it will
loop endlessly if an open curly bracket "{" is present in the message without a
closing bracket "}". The loop appears to exist purely for testing purposes,
writing the embedded string to the logfile, and serves no real role, since the
GetMessage routine has already updated its ByRef variable before reaching the
Do Until..Loop, which operates on local variables only.
Resolution of Problem
The Do Until..Loop responsible for freezing the job queue has been modified so
that execution will always exit the loop whatever the input. The code was tested
during Session 624 and appears to have fixed the problem.
Do Until..Loop Code (after the fix)
bMessageEnd = False
S = Message
Do Until bMessageEnd
    i = S.IndexOf("{")
    If i < 0 Then
        Exit Do        ' no opening bracket left in the message
    End If
    i2 = S.IndexOf("}")
    If i2 < 0 Then
        Exit Do        ' opening bracket without a closing bracket
    End If
    Try
        S2 = Trim(S.Substring(i, i2 + 1))
        S = S.Substring(i2 + 1)    ' continue with the rest of the message
    Catch ex As Exception
        Exit Do
    End Try
Loop
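For comparison, the same extraction can be written so that termination follows from the structure of the loop rather than from the exit checks. This is an illustrative sketch only (ExtractSegments is a hypothetical name, not part of the observatory code): the search position moves strictly forward on every pass, and the segment length is computed as iClose - iOpen + 1 because String.Substring takes a start index and a length. Like the loop above, it pairs each "{" with the next "}" and does not handle nested brackets.

```vbnet
Imports System
Imports System.Collections.Generic

Module BraceSegments

    ' Returns every "{...}" segment found in message, left to right.
    ' Nesting is not handled: each "{" is paired with the next "}" after it.
    Function ExtractSegments(message As String) As List(Of String)
        Dim segments As New List(Of String)
        If String.IsNullOrEmpty(message) Then Return segments
        Dim pos As Integer = 0
        While pos < message.Length
            Dim iOpen As Integer = message.IndexOf("{"c, pos)
            If iOpen < 0 Then Exit While          ' no further "{"
            Dim iClose As Integer = message.IndexOf("}"c, iOpen + 1)
            If iClose < 0 Then Exit While         ' unterminated "{": stop
            ' Substring takes (startIndex, length), not (start, end).
            segments.Add(message.Substring(iOpen, iClose - iOpen + 1))
            pos = iClose + 1                      ' strictly advances each pass
        End While
        Return segments
    End Function

    Sub Main()
        ' The third segment is deliberately left unterminated.
        For Each s As String In ExtractSegments("{""a"":1}" & "{""b"":2}" & "{""c"":")
            Console.WriteLine(s)
        Next
    End Sub
End Module
```

With the truncated input in Main, only the two complete segments are returned and the loop ends at the unterminated third one.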
Background
The Observatory Control program uses OpenPHD / Event Monitoring to send commands
to the PHD2 software and listen for messages in return. It is used to initiate
autoguiding with the Guidescope / ZWO ASI178MC camera from within Job Queue
execution, after slewing to and centering targets with image exposures > 20s.
Guiding results are sent for QC and recording with the image observations.
Autoguiding is terminated before slewing to the next target.
Autoguiding was initiated by requesting PHD2 to begin looping and then to
begin guiding.
More on PHD2 Commands & Event Monitoring (https://github.com/OpenPHDGuiding/phd2/wiki/EventMonitoring)
The Problem
Whilst PHD2 would begin looping when requested, and would attempt to begin
guiding when subsequently requested by the Job Queue executor, it was found that
PHD2 wasn't generally able to find a suitable guide star to guide on. To obtain
the expected quality of images I was forced to intervene, manually select a star
from the Full Frame image in PHD2 and manually click the Guide button to start
guiding.
Since this is a showstopper for unattended / automated operation of the
observatory, the problem was investigated in detail.
Problem Analysis
Watching the behavior of PHD2 during automated guiding attempts showed that it
wasn't selecting a fresh guide star for each new target before attempting to
guide. PHD2 seemed to be using the star/lock position from the previous target.
The guide attempt would generally fail because there was no star at/near the lock
coordinates, or the star would be quickly lost because it was very faint. Either
way the automated guiding would fail.
Two attempted fixes both failed to solve the problem: inserting a 'Select Star'
command between 'Looping' and 'Guide', and adding a pause of a few seconds after
looping starts so that the first frame could be taken and loaded into PHD2 before
commanding Guide.
Problem Resolution
The problem was finally fixed by directly requesting PHD2 to guide, without
requesting Looping first. This fix works because, as the OpenPHD guide states,
"the guide method allows a client to request PHD2 to do whatever it needs to
start guiding ...". PHD2 will
• start capturing if necessary
• auto-select a guide star if one is not already selected
• calibrate if necessary, or if the RECALIBRATE parameter is true
• wait for calibration to complete
• start guiding if necessary
• wait for settle (or timeout)
• report progress of settling for each exposure (send Settling events)
• report success or failure by sending a SettleDone event.
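A guide request of this kind might be built as follows. This is a hedged sketch based on the guide method as described in the Event Monitoring wiki linked above, where the method takes a settle object (pixels, time, timeout) followed by the recalibrate flag. BuildGuideRequest is a hypothetical helper and the settle values are placeholders, so the wiki should be checked for the exact parameter form supported by the installed PHD2 version.

```vbnet
Imports System
Imports System.Globalization

Module GuideRequestSketch

    ' Builds a jsonrpc "guide" request string (hypothetical helper).
    ' settlePixels / settleTimeSec / settleTimeoutSec fill the settle object;
    ' recalibrate corresponds to the RECALIBRATE parameter.
    Function BuildGuideRequest(id As Integer, settlePixels As Double,
                               settleTimeSec As Integer, settleTimeoutSec As Integer,
                               recalibrate As Boolean) As String
        ' Invariant culture keeps "." as the decimal separator in the JSON.
        Dim inv As CultureInfo = CultureInfo.InvariantCulture
        Return "{""method"": ""guide"", ""params"": [{""pixels"": " &
               settlePixels.ToString(inv) &
               ", ""time"": " & settleTimeSec.ToString(inv) &
               ", ""timeout"": " & settleTimeoutSec.ToString(inv) &
               "}, " & If(recalibrate, "true", "false") &
               "], ""id"": " & id.ToString(inv) & "}"
    End Function

    Sub Main()
        ' Placeholder settle values, for illustration only.
        Console.WriteLine(BuildGuideRequest(42, 1.5, 8, 40, False))
    End Sub
End Module
```

The caller would then wait for the SettleDone event rather than for the immediate response, since the response only acknowledges that the request was accepted.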
This Web Page: | Notes - Session 623 (2018-08-31) |
Last Updated : | 2023-11-26 |
Site Owner : | David Richards |
Home Page : | David's Astronomy Web Site |