From: | Dan Scott-Raynsford |
To: | Brett Baggott |
And what you are describing, where you can get wwDemo to work but yours still won't, would make sense if you had bogus server entries.
I wouldn't recommend it but if you were really at wit's end you could always change the information on the Servers tab of the Project Information page and then choose to "Regenerate Component Ids" on the Build menu. A better option would be to clean out all the entries and start from scratch (like in the docs) because if you change the ProgID you're going to have to completely re-register the COM servers again. Still, at least you'd know.
Hi Rick,
<b>Ensure COM Server Works Altogether on the Server</b>
I tried invoking the DCOM server via a VFP Shell via:
loServer=CREATEOBJECT("platoatlas.platoatlasserver")
However, VFP stops responding at this point for 2 minutes. During this time the PLATOAtlas.exe (DCOM server) can be seen in the Task Manager running under the user account specified in DCOM. After about 2 minutes the following error is shown in VFP:
<b>OLE error code 0x80080005: Server execution failed</b>
I built a version of wwdemo.exe and uploaded that to this test server, registered it with DCOM, set up the identity and DCOM permissions the same way as for our DCOM server, and tried loading it with:
loServer=CREATEOBJECT("wwdemo.wcDemoServer")
This created the Demo server correctly and I was able to interact with it just fine.
<b>Check for hangs in your Startup Code</b>
This would seem to indicate a problem with my start up code somewhere. But to check that the wwserver.init method was actually firing I added logging to it (in my platoatlasmain.prg):
<code lang="vfp">
DEFINE CLASS PLATOAtlasServer AS WWC_SERVER OLEPUBLIC

   FUNCTION INIT()
      * Load wwUtils so that LogString() is available
      DO wwUtils.prg
      LogString("WWSERVER::INIT() Fired")
      DODEFAULT()
   ENDFUNC

ENDDEFINE
</code>
When running in File Mode the string is written to the log file. When running in DCOM mode (or created in a VFP Shell) on this server the log entry is _not_ written (nor the file created).
I would have expected that this should have fired before anything else (any of my startup code) - or is that not correct? I'd really like to verify that some/any VFP code is actually executing in the server when running as DCOM. If I can confirm that at least VFP code is being run in the server then I can begin adding logging functions. But if VFP code is never even being called then adding logging functions won't get me far!
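One way to test this is a trace that does not depend on wwUtils or on the current directory, since a server launched by DCOM can start in a different working directory than it does in File Mode. The following is only a minimal sketch, assuming VFP 8 or later (for TRY/CATCH). The function name StartupTrace and the log path c:\temp\platoatlas_startup.log are hypothetical and not part of Web Connection.

```foxpro
* Hypothetical helper, not part of Web Connection or wwUtils.
* Appends one line to a log at a fixed absolute path so the result does
* not depend on the current directory or on wwUtils having loaded.
FUNCTION StartupTrace(tcMessage)
LOCAL lcLine, llOk
lcLine = TTOC(DATETIME()) + " | " + tcMessage + ;
         " | account: " + SYS(0) + ;
         " | curdir: " + SYS(5) + CURDIR() + CHR(13) + CHR(10)
llOk = .T.
TRY
   * Assumed path; the DCOM identity needs write access to this folder.
   STRTOFILE(lcLine, "c:\temp\platoatlas_startup.log", .T.)
CATCH
   * Never let tracing itself take the server down.
   llOk = .F.
ENDTRY
RETURN llOk
ENDFUNC
```

Calling StartupTrace("Init: top") as the very first line of INIT(), before wwUtils is loaded, would show whether any VFP code runs at all, and the logged account and directory would show the context the server actually starts in.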
So although we're definitely making progress tracking down the issue it's still a bit of a mystery!
Thanks
Dan
Dan,
Brett's recommendations are good.
A couple more things that might help.
Ensure COM Server Works Altogether on the Server
Make sure the COM server can be invoked on that machine at all. Try calling it from VFP, VBScript, or a PowerShell script:
loServer = CREATEOBJECT("wcDemo.wcDemoServer")
*** One of the two below to fire a method
? loServer.ProcessHit("QUERY_STRING=wwDemo~TestPage")
? loServer.ProcessHit("PHYSICAL_PATH=c:\westwind\wconnect\testpage.wwd")
If this produces a response you know the server is OK and the issue is one of permissions.
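If the server does not produce a response, it helps to capture how long the call blocked and which error came back. A minimal sketch of the same test wrapped in error handling, assuming VFP 8 or later (for TRY/CATCH); the ProgID and request string are the ones from the test above and should be swapped for your own server:

```foxpro
* Sketch only: time the COM activation and report any OLE error.
LOCAL loServer, loEx, lnStart
loServer = .NULL.
lnStart = SECONDS()
TRY
   loServer = CREATEOBJECT("wcDemo.wcDemoServer")
   ? "Server created after", SECONDS() - lnStart, "seconds"
   ? loServer.ProcessHit("QUERY_STRING=wwDemo~TestPage")
CATCH TO loEx
   * ErrorNo and Message carry the VFP error, including the OLE text
   ? "Failed after", SECONDS() - lnStart, "seconds"
   ? loEx.ErrorNo, loEx.Message
FINALLY
   * Release the reference so the server can shut down
   loServer = .NULL.
ENDTRY
```

A failure that only appears after a long wait points at the server starting but never finishing its startup, while an immediate failure points more toward registration or launch permissions.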
Set DCOM to use Caller's Permissions and inherit Application Pool Identity
To be sure I would configure DCOM impersonation to a local machine account first. Try SYSTEM (which should almost always work unless you have resources that live on other machines) or a local machine admin account that has full rights to everything. If possible call a request that is a 'do nothing' 'helloworld' type of request to minimize startup issues that might be crashing the server due to resource issues.
Another option: The DCOM servers inherit permissions from the Application Pool if you set the DCOM permissions to 'The launching user.'. Typically this will be SYSTEM or whatever the AppPool is configured for. By leaving it at 'the launching user' and setting the permissions of the Application Pool in IIS you are removing the requirement to configure DCOM altogether.
When you say the server starts but hangs, what account does it show up under in Task Manager/Process Explorer? Is it the right account? This still doesn't solve it all, since Launch and Access permissions are assigned separately in DCOM config.
Check for hangs in your Startup Code
If the COM object loads, but apparently hangs before a request fires it's quite possible that your startup code (OnLoad/OnInit) has code in it that is failing. One thing to try to see if that's the problem is remove all your startup code and put it into separate methods. Remove the calls to those methods and then hit a simple HelloWorld request.
Does that work? If it does, then there's a problem in the startup code that might be the result of missing permissions of some sort.
Logging Startup Code
Finally, if all else fails, try logging in the server startup code. Put calls to LogString() (from wwUtils) into the server's initialization methods, one at the top and one at the end of each: OnInit(), OnLoad() and Process(). This should tell you whether these methods are getting fired at all.
Make sure that the location you write the logging file to has permissions that allow the app to write to it. Allow the specific account that you have DCOM/ApplicationPool configured for and, just to be sure, add Everyone.
You'll want to dial back the permissions once you have it running.
If you want I can also take a look at your server and see if I can find anything that doesn't look right. You can contact me here: http://west-wind.com/contact.aspx.
Finally, I know you have 4.68 servers running. There have been a lot of improvements in 5.x in terms of making the security environment a bit easier. As Brett mentioned the managed module handler actually makes a lot of things easier and provides better error reporting among other things.
Not sure if DCOM issues will necessarily be any better, but it might be a good idea going forward to look into moving to version 5.0 just to keep up with the latest versions and fixes... the change over is not 100% transparent, but the changes are isolated to a few system related required updates that can be changed globally in a few key places.
+++ Rick ---
Hi Brett,
Thank you very much - that's awesome advice. I haven't tried the HTTP Handler Module - is it a feature of WWC 5.0? We're only running WWC 4.68 at this client. However I'd love to do away with DCOM - when it works it's great but when it goes bad it's usually a complete mystery! So I think I'll look into this HTTP Handler.
We are fairly certain the fault lies in DCOM. We even got a brand new server that has never had our WWC servers installed and installed the system from scratch. It exhibited the exact same problems. We are thinking there is something wrong with a Windows Security Policy or the Windows Account Security - as that appears to be the only thing connecting all 4 servers. It is of course possible that some MS patch or other OS software is causing the problem but that's beyond our expertise. The client has pretty much now brought in their top people to try and track this down (with network sniffers and other such tools). File Mode is our only saving grace here (the servers run flawlessly in File Mode).
So fingers crossed the super technical guys can spot something we're missing!
Thanks again
Dan
Dan, I've been using WWC for about 10 years and I've also encountered some odd DCOM problems. My normal procedure for troubleshooting is similar to what you've already done but just in case it might throw up some light bulbs, I'll mention it.
First, before doing anything, I put the servers in file mode and make sure that works. You'd be surprised the number of times just doing that has highlighted an issue I was sure wasn't a problem.
Assuming file mode works fine, my next step is somewhat new but it has proven invaluable. That is, I use the new Http Handler Module (I've actually moved to this permanently based on my problems with DCOM).
Assuming the Http Handler Module works, that means we definitely do have a DCOM issue (even though I might "know" this based on the error, I still follow the process).
Next, I double check all my identities and locations making sure my .ini files line up with each other.
Next I do what you said you've done and unreg / re-reg the server, remove all DCOM entries and create them from scratch.
Then, because it's got me SO MANY times, I make sure to IISRESET, _then_ I see if everything starts working.
It sounds like you've covered all this so far but are still having problems. My one final tip to you is to at least try the Http Handler Module. It's made my life so much easier.
Hi Rick,
I've checked that the AppPool is running under Local System (the highest account). This is running on a W2K3 Server SP2.
The WC.DLL is set to Allowed in IIS Web Service Extensions.
Going to the ShowStatus page the Current Login is SYSTEM.
We've unregistered the DCOM servers and re-registered them. We've even deleted the DCOM registry entries for the server (didn't like doing this) and re-registered them.
In DCOM mode the servers start up under the correct account (as per the DCOM identity page) - we can see them in Task Manager. They sit there using about 6MB memory. They ignore all commands to End Process (i.e. they don't terminate). After about 2-3 minutes the servers seem to end on their own and then the "An internal exception occurred in the call to LoadServers (COM)" browser message appears - although sometimes the message is "An internal exception occurred in the ISAPI application".
We've disabled virus scanners on the servers. The client says that they have changed nothing on these servers and no patches or updates have been installed.
We know something must have changed (three independent servers at the same facility don't all of a sudden have their DCOM fail at the same time) but we're at a loss as to what (and the client doesn't seem to know).
I've been using WWC for many years and although I've run into the occasional glitch with DCOM, I've never encountered something like this. Usually re-registering the servers, reconfiguring DCOM and restarting IIS does the trick. But this time we've had 4 people working on the issue for 3.5 days now and just can't figure it out! I just hope it's not some obscure MS patch that somehow breaks DCOM with VFP/WWC because we've got lots of other clients with this same setup and would hate to think they're one patch away from this.
Any insight you've got here would be greatly appreciated!
Thanks
Dan
Hmmm... this is an unhandled exception that didn't throw back COM error codes in that case, which is odd. LoadServers definitely should only fail if there's a COM error.
Can you make sure that you're running the right module (i.e. either ISAPI or .NET Module) and that it's configured properly for the COM server (i.e. has the right ProgIDs)?
Also make sure the IIS AppPool is running with a high access account and that whatever account that is hasn't changed either. If you go to:
wc.wc?_maintain~ShowStatus (or use whatever scriptmap is configured)
you should see all the actual logon account information that the app is running under. If you have problems getting this far, or if the app is being hit in the meantime, switch to file mode first.
+++ Rick ---
Correction - the message in the browser is:
An internal exception occurred in the call to LoadServers (COM)
Hi Rick,
Thanks for your response. Our first thoughts were also the AD account changing in some way. So we got a new account created and reset the DCOM security with it - to no avail. I haven't tried setting the default access permissions yet but I'll do that now.
The browser error message that is being shown (I've never seen this before):
Web Connection Error
An internal exception in the call to LoadServers (COM)
Thanks
Dan
Hi Dan,
It sounds like the permissions on the AD account might have changed. If there was a password change or anything else about that account has changed you'll have to re-apply the DCOM Impersonation to make sure the account is still linked.
Other than that it sounds like the Launch permissions are working, but the Access permissions are not. So check your global DCOM settings and make sure that the account in question has rights to access DCOM components (It's in the Computer level COM+ Properties I believe).
To rule out problems it might be useful to launch the servers with SYSTEM or a local ADMIN account rights to see if permissions indeed are the problem. That should tell you right away...
What does the error message say in the browser when this fails? There should be a COM Error code plus a message that shows up in the browser if it's a DCOM load error.
+++ Rick ---
Hi Rick,
We have a client who has 3 servers - each running the same WWC 4.68 intranet servers (on 32-bit W2K3 Servers). There are two live servers and one test server.
About 2 days ago all 3 DCOM servers stopped loading properly or responding. The client _claims_ that no changes have been made to any of the 3 systems or the related databases. The WWC DCOM servers have been operating perfectly there for several months.
The DCOM settings have been completely reset on all 3 servers. The DCOM servers have been /unregistered and then /registered. A new AD account was also created and assigned to the DCOM identity. The Security of the DCOM servers has been set to Everyone has Local+Remote Launch and Everyone has Local+Remote Access permissions.
The servers operate _perfectly_ in File Mode when running under the same AD account that is assigned to the DCOM. In COM mode the servers do appear to start (they show up in Task Manager) - but they have a much smaller than expected memory footprint and do not respond and cannot be terminated using End Process. These unresponsive DCOM servers seem to shut automatically after a while but this appears to be because wc.dll is terminating them because they are timing out.
IIS has been restarted and the servers have been fully rebooted. The servers are 3 separate machines. Currently we have the Live system operating off a single machine running multiple servers in File Mode. The 2nd live machine is available to us for testing/checking.
These exact same servers and setups are in use at about 7 other sites without this issue.
The wcErrors.txt shows:
2012-11-20 07:39:13:888 Web Connection Request timed out. - ?atlas~start - 0
2012-11-20 08:32:28:836 Web Connection Request timed out. - ?atlas~start - 2
2012-11-20 08:32:28:836 Web Connection Request timed out. - ?atlas~start - 0
2012-11-20 08:39:02:138 An exception occurred in WC.DLL: _maintain~Load
Unloading all servers and reloading... - ?_maintain~Load - 0
2012-11-20 09:45:52:488 Exception in Loading Servers (COM) - 1008
2012-11-20 09:49:12:069 An exception occurred in WC.DLL: atlas~worksheetsavedoc~899219
Unloading all servers and reloading... - ?atlas~worksheetsavedoc~899219 - 0
2012-11-20 09:53:13:072 Exception in Loading Servers (COM) - 1008
2012-11-20 09:55:49:121 An exception occurred in WC.DLL: atlas~worksheetsavedoc~899219
Unloading all servers and reloading... - ?atlas~worksheetsavedoc~899219 - 0
2012-11-20 09:57:49:123 An exception occurred in WC.DLL: atlas~ps~platomain
Unloading all servers and reloading... - ?atlas~ps~platomain - 0
We've exhausted all our expertise on this. Do you have any thoughts on this, and/or would you be available to assist the client at whatever your normal rates are?
Thanks
Dan
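The wcErrors.txt entries quoted above fall into three kinds: request timeouts, exceptions in wc.dll, and COM load failures. A minimal sketch for tallying them across the whole file so the pattern can be compared between machines. The path c:\temp\wcErrors.txt is hypothetical (point it at a copy of the real log), and the match strings are taken from the excerpt above:

```foxpro
* Sketch only: count the three kinds of entries seen in wcErrors.txt.
LOCAL lcFile, lnLines, lnI, lnTimeouts, lnDllErrors, lnLoadErrors
LOCAL ARRAY laLines[1]
lcFile = "c:\temp\wcErrors.txt"   && hypothetical location of a log copy
STORE 0 TO lnTimeouts, lnDllErrors, lnLoadErrors

* Read the whole file and split it into one array element per line
lnLines = ALINES(laLines, FILETOSTR(lcFile))
FOR lnI = 1 TO lnLines
   DO CASE
   CASE "Request timed out" $ laLines[lnI]
      lnTimeouts = lnTimeouts + 1
   CASE "Exception in Loading Servers" $ laLines[lnI]
      lnLoadErrors = lnLoadErrors + 1
   CASE "An exception occurred in WC.DLL" $ laLines[lnI]
      lnDllErrors = lnDllErrors + 1
   ENDCASE
ENDFOR

? "Request timeouts:   ", lnTimeouts
? "WC.DLL exceptions:  ", lnDllErrors
? "COM load failures:  ", lnLoadErrors
```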
Hi Brett,
Thanks again for all your suggestions!
We had actually assumed the issue could be broken DCOM registration so I actually did completely blow away the DCOM registration (/unregserver followed by manually purging the Prog Ids from the registry). But this didn't fix it this time. I've run into this sort of thing before and purging (via the registry) and reregistering the DCOM usually does the trick.
It's also odd that this issue has occurred on 3 completely separate servers at the same facility all at the same time. A 4th server at the facility was built and, after installing the server from scratch, exhibits the exact same problem. So we've got 4 distinct servers all doing the same thing.
The actual server executables haven't changed since July 2012 so it couldn't be a new version etc. In fact, we can't see any changes that occurred that could have suddenly made all the servers stop like this. A broken database or bad supporting file should show up in File Mode or at least the servers should have started up enough to write a simple log entry.
A real head scratcher to be sure!
Thanks
Dan