Spork Boards

MacOS X a dog or just in need of a good bitch slap?

Dr Phred (Moderator) – December 10, 2007 10:05AM
Can't keep a good topic down....

-Swine Flu free since...cough, cough...

tomierna (Admin) – April 07, 2012 03:34AM
Hideously Unnatural
Beach balls happen when I/O is wedged.

IO doesn't necessarily mean RAM, though it definitely does when you are swapping a lot.

Knowing if it is just a program or the whole UI that is affected, or in the worst cases, the whole system, including the CLI and network, can be helpful in figuring out what part of IO is wedged.

For instance, if you are beach balling just within Safari or Camino and can get to Activity Monitor, click on the (Not Responding) process and sample it. You can usually pick out the calls it's hanging in by name, and they will sometimes help you understand what part of the program is to blame. In my experience with browsers, this is usually a plugin or extension which needs updating, although in some cases it may be something system-wide which only shows up when an app hits it, like a bad QuickTime codec.

If it is progressive (one program becomes unresponsive, you switch to another, and it quickly becomes unresponsive too) and you are able to rule out swap, it's likely that disk I/O is blocked. I've found a restart is usually needed to fix this, and it usually only happens when a badly coded process is leaking open file handles. These are hard to fix, except by replacing the program in question or having the developer fix it.

Neither of these scenarios is limited to Mac OS, but they show up in different ways in different OSes.

ddt – April 07, 2012 05:19AM
Tom, how does one pick out the calls by name? I can't tell what I'm looking at in the Sample...

ddt

tliet – April 07, 2012 05:22AM
i/o bottleneck has been solved in my machine by the ssd ;-D

porruka (Admin) – April 07, 2012 07:14AM
Just to hit and run, not all page faults are bad because not all actually hit the disk. There are memory-mapped files, there are pages locked into place, there are programming methodologies that make freeing available memory easier (or harder) for the OS to manage. The specifics of any one case are unlikely to bear a useful relationship to another, honestly.

If any of the geeks in the crowd want to *really* know what's going on, install the dev tools and run Instruments, or go into bare-metal mode and dive into DTrace. I've been having to mess with this lately and it's an amazing tool if you can get your brain around it. The official book is only 1100ish pages...

tomierna (Admin) – April 07, 2012 07:40AM
Hideously Unnatural
Can you copy/paste the top few lines in here, ddt?

ddt – April 07, 2012 10:40AM
Thanks, Tom.

Here (w/o reboots in between) is Camino:

Sampling process 4372 for 3 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
Analysis of sampling Camino (pid 4372) every 1 millisecond
Process: Camino [4372]
Path: /Applications/Camino (2111.09.08).app/Contents/MacOS/Camino
Load Address: 0x1000
Identifier: org.mozilla.camino
Version: 2.1.2 (2112.03.08)
Code Type: X86 (Native)
Parent Process: launchd [240]

Date/Time: 2012-04-07 12:13:30.763 -0700
OS Version: Mac OS X 10.7.3 (11D50)
Report Version: 7

Call graph:
2257 Thread_587792: Main Thread DispatchQueue_
+ 2257 start (in Camino) + 41 [0x3339]
+ 2257 _start (in Camino) + 216 [0x3412]
+ 2257 main (in Camino) + 254 [0x3b8e]
+ 2257 NSApplicationMain (in AppKit) + 1054 [0x92241261]
+ 2248 -[NSApplication run] (in AppKit) + 911 [0x91fad675]
+ ! 2244 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (in AppKit) + 113 [0x91fb1306]
+ ! : 2240 _DPSNextEvent (in AppKit) + 678 [0x91fb1a9c]
+ ! : | 2240 BlockUntilNextEventMatchingListInMode (in HIToolbox) + 88 [0x9036e356]
+ ! : | 2211 ReceiveNextEventCommon (in HIToolbox) + 381 [0x9036e4e7]
+ ! : | + 2210 RunCurrentEventLoopInMode (in HIToolbox) + 318 [0x9036717f]
+ ! : | + ! 2210 CFRunLoopRunInMode (in CoreFoundation) + 120 [0x90048328]
+ ! : | + ! 2206 CFRunLoopRunSpecific (in CoreFoundation) + 332 [0x9004847c]
+ ! : | + ! : 1980 __CFRunLoopRun (in CoreFoundation) + 1428 [0x90048da4]
+ ! : | + ! : | 1980 __CFRunLoopServiceMachPort (in CoreFoundation) + 170 [0x9003fc7a]
+ ! : | + ! : | 1980 mach_msg (in libsystem_kernel.dylib) + 70 [0x958ce1f6]
+ ! : | + ! : | 1980 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x958cec22]
+ ! : | + ! : 161 __CFRunLoopRun (in CoreFoundation) + 1112 [0x90048c68]
+ ! : | + ! : | 161 __CFRunLoopDoSources0 (in CoreFoundation) + 246 [0x9001ed96]
+ ! : | + ! : | 161 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ (in CoreFoundation) + 15 [0x9001f3df]



And here is Safari:

Call graph:
2343 Thread_873038 DispatchQueue_1: com.apple.main-thread (serial)
+ 2343 ??? (in Safari) load address 0x1039af000 + 0xf24 [0x1039aff24]
+ 2343 SafariMain (in Safari) + 197 [0x7fff91d7d48d]
+ 2343 NSApplicationMain (in AppKit) + 867 [0x7fff95836b88]
+ 2343 -[NSApplication run] (in AppKit) + 470 [0x7fff955b819d]
+ 2343 -[BrowserApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (in Safari) + 171 [0x7fff91bc9165]
+ 2343 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (in AppKit) + 135 [0x7fff955bb861]
+ 2343 _DPSNextEvent (in AppKit) + 659 [0x7fff955bbf5d]
+ 2343 BlockUntilNextEventMatchingListInMode (in HIToolbox) + 62 [0x7fff992cc456]
+ 2343 ReceiveNextEventCommon (in HIToolbox) + 355 [0x7fff992cc5c9]
+ 2343 RunCurrentEventLoopInMode (in HIToolbox) + 277 [0x7fff992c531f]
+ 2343 CFRunLoopRunSpecific (in CoreFoundation) + 230 [0x7fff96390676]
+ 2321 __CFRunLoopRun (in CoreFoundation) + 1204 [0x7fff96390e64]
+ ! 2321 __CFRunLoopServiceMachPort (in CoreFoundation) + 188 [0x7fff963886fc]
+ ! 2321 mach_msg (in libsystem_kernel.dylib) + 73 [0x7fff98dedd71]
+ ! 2321 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fff98dee67a]
+ 18 __CFRunLoopRun (in CoreFoundation) + 1617 [0x7fff96391001]
+ ! 18 __CFRunLoopDoTimer (in CoreFoundation) + 534 [0x7fff963b0776]
+ ! 18 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ (in CoreFoundation) + 20 [0x7fff963b0c24]



And here is Firefox, only a blank tab showing (after quitting Safari):

Call graph:
2523 Thread_877530 DispatchQueue_1: com.apple.main-thread (serial)
+ 2523 start (in firefox) + 52 [0x100001434]
+ 2523 start (in firefox) + 2322 [0x100001d12]
+ 2523 XRE_main (in XUL) + 12438 [0x101017ec6]
+ 2523 js::SecurityWrapper::~SecurityWrapper() (in XUL) + 614580 [0x101bb40e4]
+ 2523 JSD_DebuggerOnForUser (in XUL) + 707389 [0x101d4fbcd]
+ 2523 -[NSApplication run] (in AppKit) + 470 [0x7fff955b819d]
+ 2523 JSD_DebuggerOnForUser (in XUL) + 706386 [0x101d4f7e2]
+ 2523 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (in AppKit) + 135 [0x7fff955bb861]
+ 2523 _DPSNextEvent (in AppKit) + 659 [0x7fff955bbf5d]
+ 2523 BlockUntilNextEventMatchingListInMode (in HIToolbox) + 62 [0x7fff992cc456]
+ 2523 ReceiveNextEventCommon (in HIToolbox) + 355 [0x7fff992cc5c9]
+ 2523 RunCurrentEventLoopInMode (in HIToolbox) + 277 [0x7fff992c531f]
+ 2523 CFRunLoopRunSpecific (in CoreFoundation) + 230 [0x7fff96390676]
+ 2522 __CFRunLoopRun (in CoreFoundation) + 1204 [0x7fff96390e64]
+ ! 2522 __CFRunLoopServiceMachPort (in CoreFoundation) + 188 [0x7fff963886fc]
+ ! 2522 mach_msg (in libsystem_kernel.dylib) + 73 [0x7fff98dedd71]
+ ! 2522 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fff98dee67a]
+ 1 __CFRunLoopRun (in CoreFoundation) + 1216 [0x7fff96390e70]
+ 1 pthread_mutex_lock (in libsystem_c.dylib) + 480 [0x7fff97640160]
+ 1 OSAtomicCompareAndSwap64Barrier$VARIANT$mp (in libsystem_c.dylib) + 0 [0x7fff97687544]


ddt

Alan Lehman – April 07, 2012 01:13PM
Quote
Cloudscout
Quote
Alan Lehman
Uhhhh. No? 10.6 and every preceding version will page itself into beach ball unresponsiveness long before it'll willingly give up inactive RAM. Sorry, but I live with that every day. 'purge' on the command line helps slightly but not much. From what I hear from my coworkers, Lion isn't any better in that regard.

That's not my experience.

I just did a test.

Here's my experience just a few minutes ago.
TL;DR Fresh boot, VM at 140G. Half an hour later, the VM is up 33G, I am actively paging and inactive RAM is 6.1GB/8GB total.

Fresh boot plus Safari, Terminal and TextWrangler. Postgres, bash and perl on the command line. This stays the same throughout.
1:49 PM
PhysMem: 704M wired, 498M active, 270M inactive, 1472M used, 6718M free.
VM: 140G vsize, 1043M framework vsize, 65842(2) pageins, 0(0) pageouts.
Networks: packets: 7057/2907K in, 3102/515K out.
Disks: 27572/957M read, 18909/217M written.

2:10 PM
MemRegions: 9257 total, 401M resident, 22M private, 227M shared.
PhysMem: 703M wired, 767M active, 4888M inactive, 6357M used, 1833M free.
VM: 164G vsize, 1043M framework vsize, 140756(0) pageins, 95(0) pageouts.
Networks: packets: 32212/7284K in, 18966/2448K out.
Disks: 59108/18G read, 26787/1771M written.

2:13 PM
MemRegions: 9271 total, 443M resident, 22M private, 231M shared.
PhysMem: 701M wired, 830M active, 5477M inactive, 7008M used, 1178M free.
VM: 164G vsize, 1043M framework vsize, 140943(0) pageins, 1952(0) pageouts.
Networks: packets: 35190/7781K in, 20544/2674K out.
Disks: 77455/32G read, 28180/2095M written.

2:17 PM
MemRegions: 9952 total, 908M resident, 23M private, 242M shared.
PhysMem: 700M wired, 1286M active, 6161M inactive, 8147M used, 42M free.
VM: 173G vsize, 1043M framework vsize, 141306(0) pageins, 4004(32) pageouts.
Networks: packets: 40700/8367K in, 23910/3060K out.
Disks: 90068/40G read, 31883/2980M written.

6161M of inactive RAM and 33 GB of new swap. Seriously?

I'm still working. VM peaked around 191GB and inactive RAM did drop. Eventually.

porruka (Admin) – April 07, 2012 02:26PM
You're having very few pageouts, so while the swap allocation is growing (because of the database, perhaps?) it doesn't appear from these snapshots that you're actually thrashing/swapping to disk.
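
A rough way to check that reading against the snapshots above: the short Python sketch below diffs the pagein/pageout counters between consecutive top readings. The numbers are copied from Alan's post; the function names and the 4 KiB page size are assumptions made for illustration, not something stated in the thread.

```python
import re

# (time, PhysMem line, VM line) copied from the top snapshots posted above.
SNAPSHOTS = [
    ("1:49 PM",
     "PhysMem: 704M wired, 498M active, 270M inactive, 1472M used, 6718M free.",
     "VM: 140G vsize, 1043M framework vsize, 65842(2) pageins, 0(0) pageouts."),
    ("2:10 PM",
     "PhysMem: 703M wired, 767M active, 4888M inactive, 6357M used, 1833M free.",
     "VM: 164G vsize, 1043M framework vsize, 140756(0) pageins, 95(0) pageouts."),
    ("2:13 PM",
     "PhysMem: 701M wired, 830M active, 5477M inactive, 7008M used, 1178M free.",
     "VM: 164G vsize, 1043M framework vsize, 140943(0) pageins, 1952(0) pageouts."),
    ("2:17 PM",
     "PhysMem: 700M wired, 1286M active, 6161M inactive, 8147M used, 42M free.",
     "VM: 173G vsize, 1043M framework vsize, 141306(0) pageins, 4004(32) pageouts."),
]

PAGE_KIB = 4  # assumption: each pageout is one 4 KiB page


def parse_snapshot(physmem_line, vm_line):
    """Pull the inactive size and cumulative paging counters out of two top lines."""
    return {
        "inactive_mb": int(re.search(r"(\d+)M inactive", physmem_line).group(1)),
        "pageins": int(re.search(r"(\d+)\(\d+\) pageins", vm_line).group(1)),
        "pageouts": int(re.search(r"(\d+)\(\d+\) pageouts", vm_line).group(1)),
    }


def paging_deltas(snapshots):
    """Return (start, end, new pageins, new pageouts) for each consecutive pair."""
    parsed = [(when, parse_snapshot(phys, vm)) for when, phys, vm in snapshots]
    return [
        (t0, t1, b["pageins"] - a["pageins"], b["pageouts"] - a["pageouts"])
        for (t0, a), (t1, b) in zip(parsed, parsed[1:])
    ]


for start, end, ins, outs in paging_deltas(SNAPSHOTS):
    print(f"{start} -> {end}: +{ins} pageins, +{outs} pageouts")

total_out = parse_snapshot(*SNAPSHOTS[-1][1:])["pageouts"]
print(f"total paged out: about {total_out * PAGE_KIB / 1024:.1f} MiB")
```

Under that page-size assumption, the 4004 cumulative pageouts come to roughly 16 MB written out over the half hour, which is tiny next to a vsize that grew by 33G.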

Alan Lehman – April 07, 2012 03:52PM
True. Paging isn't very intense at this point. The beach balling started later but the VM stayed below 200 so paging wasn't all that bad today. Still, VM up 33, inactive >75% of my RAM. That frustrates me. As I understand it, inactive is RAM that's reserved for things that might come back so that they can relaunch faster. But I'm running shell scripts that parse large data sets with lots of command line tools. I suspect that all of my scripts/tools are ending up in inactive despite the fact that they won't get called again.

porruka (Admin) – April 07, 2012 05:32PM
Without even knowing any more, Alan, I'd agree that the usage you describe will grow the "inactive" and could fool the recovery algorithms.

Like I mentioned, I've been digging into low-level performance and measuring tools lately. Something that you should try (that I found today) is vmmap(1).

Try on the commandline:

vmmap -w -resident [some processname]

where [some processname] might be (case-sensitive) PostgreSQL, bash, or (USE WITH CARE) Safari.

I actually managed to hang Safari PPC with it (Leopard) until I ctrl-c'd the attempt, but it worked fine on other processes. And if you get into beachballin', check out Spin Monitor in Instruments to start the collection to identify why...



Edited 1 time(s). Last edit at 04/07/2012 05:35PM by porruka.

El Jeffe – April 08, 2012 06:07AM
What a journey.
Quote
ddt
There's only one way we can settle this, gentlemen: Call in Siracusa.

ddt

I was just in an RL discussion with a guy I work with (an Ars Technica founder) who mentioned Siracusa.

tomierna (Admin) – April 08, 2012 08:52AM
Hideously Unnatural
ddt, were these samples done when the apps were beach balling?

ddt – April 08, 2012 09:18AM
Tom,

No. There was some delay when app switching or selecting tabs, but no beachballs.

ddt

Alan Lehman – April 09, 2012 06:40AM
Quote
porruka
Try on the commandline:

vmmap -w -resident [some processname]

Whoa, that's a mouthful of output. I'll give that a good try and see what I can learn.

tomierna (Admin) – April 09, 2012 11:04AM
Hideously Unnatural
All of these tools are only really helpful while the beachballing is happening. You're looking to see what calls and libraries the app is running at the time of the hang, so you can see what's wedged.

Next time it happens, if you want to copy-paste the whole trace into an e-mail and send it to me, I'll take a shot at a post-mortem.
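
As a rough illustration of reading a sample by call name, here is a small Python sketch that takes a few lines from the Safari call graph posted earlier in the thread and works out what share of the main thread's samples landed in mach_msg_trap, which is where a run loop sits when it is simply waiting for the next event. The regexes and the function name are my own assumptions about the layout of the sample text, not an official parser.

```python
import re

# Thread header, e.g. "2343 Thread_873038 DispatchQueue_1: com.apple.main-thread (serial)"
THREAD = re.compile(r"^(\d+)\s+Thread_")
# Frame line, e.g. "+ ! 2321 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x...]"
FRAME = re.compile(r"^[+!:| ]*(\d+)\s+(.+?)\s+\(in ([^)]+)\)")

# A few lines copied from the Safari sample earlier in the thread.
SAFARI_EXCERPT = """\
2343 Thread_873038 DispatchQueue_1: com.apple.main-thread (serial)
+ 2343 CFRunLoopRunSpecific (in CoreFoundation) + 230 [0x7fff96390676]
+ 2321 __CFRunLoopRun (in CoreFoundation) + 1204 [0x7fff96390e64]
+ ! 2321 __CFRunLoopServiceMachPort (in CoreFoundation) + 188 [0x7fff963886fc]
+ ! 2321 mach_msg (in libsystem_kernel.dylib) + 73 [0x7fff98dedd71]
+ ! 2321 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fff98dee67a]
+ 18 __CFRunLoopRun (in CoreFoundation) + 1617 [0x7fff96391001]
+ ! 18 __CFRunLoopDoTimer (in CoreFoundation) + 534 [0x7fff963b0776]
"""


def idle_fraction(call_graph, idle_symbol="mach_msg_trap"):
    """Share of thread samples whose stack contains idle_symbol."""
    total = 0
    idle = 0
    for line in call_graph.splitlines():
        thread = THREAD.match(line)
        if thread:
            total += int(thread.group(1))
            continue
        frame = FRAME.match(line)
        if frame and frame.group(2) == idle_symbol:
            idle += int(frame.group(1))
    return idle / total if total else 0.0


print(f"{idle_fraction(SAFARI_EXCERPT):.1%} of main-thread samples were idle")
```

For that excerpt, 2321 of 2343 samples are in mach_msg_trap, so the main thread was almost entirely idle. That fits a sample taken while the app was not hung. In a sample taken during a hang, I'd expect most of the main thread's samples to sit under some other call, and that call's name is the clue.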

ddt – April 11, 2012 04:55PM
Wow, just had an even weirder crash. Camino started being unresponsive, so I tried to sample from Activity Monitor. That stalled out (the progress bar on the sample window stopped moving for minutes), with Camino taking over 100% of one CPU. Then VLC started taking up a lot of RAM and CPU, so I tried to quit it. It too became Not Responding. So I Force Quit VLC. Which removed the icon from the Dock, and VLC from the menu bar and Activity Monitor. Yet the movie clip kept playing. But the VLC Main Window controls wouldn't respond to clicks or keyboard commands. It was like a ghost app, playing along. So I tried logging out. All apps quit, gray screen... except the VLC clip kept playing, with a spinning beachball when the cursor went over it! I let it go for a few minutes, then force restarted. Yay?

ddt

tomierna (Admin) – April 13, 2012 05:26AM
Hideously Unnatural
I didn't see anything in the System Profiler report that disturbed me, except for one thing: your system.log had an I/O error in it, and it was the mds (Spotlight) process reporting it.

The System Profiler report only shows a small portion of system.log, so you might look through the Console app's view of the system.log and check for I/O or "io " errors.

You may have a failing disk.

If this is the case, please consider upgrading to an SSD when replacing it!
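
If scrolling through Console gets tedious, something like the Python sketch below can do the filtering. It is only an illustration: the function names are made up, the match rule (a standalone "I/O" or "io" plus the word "error" on the same line) is my guess at a reasonable filter, and the /var/log/system.log path in the usage comment is the usual location but should be treated as an assumption for your machine.

```python
import re

# Standalone "I/O" or "io", any case; will not match inside words like "radio".
IO_TOKEN = re.compile(r"\b(?:i/o|io)\b", re.IGNORECASE)


def io_error_lines(lines):
    """Return the lines that mention both an I/O token and the word 'error'."""
    return [
        line.rstrip("\n")
        for line in lines
        if IO_TOKEN.search(line) and "error" in line.lower()
    ]


def scan_log(path):
    """Read a log file and return its I/O error lines."""
    with open(path, errors="replace") as handle:
        return io_error_lines(handle)


# Usage (path is an assumption; you may need sudo to read it):
#   for hit in scan_log("/var/log/system.log"):
#       print(hit)
```

A handful of hits that all name the same disk would back up the failing-disk theory. No hits doesn't clear the disk, since older entries may already have been rotated out of system.log.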

El Jeffe – May 12, 2012 04:36AM
What a journey.
I would like to be able to right-click on a(ny) message in the console, and "find answers at Apple Support" or "Submit to Apple Support" (most likely, the forums; not actual support/case management peoples).

Is that too much to ask for? (Or do we need more iOS features --> fart apps)

ddt – May 14, 2012 01:57PM
Well, the top-listed change to Safari 5.1.7 (just released) is "Improve the browser's responsiveness when the system is low on memory".

ddt

ddt – July 25, 2012 03:59AM
DPBD:

Okay, who's got Mountain Lion already and feedback? I'd shell out US$20 for better memory management under the hood alone.

ddt
