[Nitro] NASTY bug

Mark Van De Vyver mvyver at gmail.com
Fri Nov 9 20:04:01 EST 2007


On Nov 9, 2007 7:54 PM, George Moschovitis <george.moschovitis at gmail.com> wrote:
> Dear devs,
>
> I am trying to find a nasty bug in
>
> lib/raw/context/session/cookie.rb
>
> This file implements a cookie-based session store, i.e. the session data is
> serialized to/from a cookie.
> For security we store both the serialized session data and an encrypted
> version of it (called the digest).
>
> When deserializing we check the raw data against the digest to find out if
> the user has tampered with the data.
>
> This scheme works about 90% of the time. But sometimes (seemingly at
> random) the digest check fails (i.e. crypt(data) != digest)
> for no apparent reason.
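
For readers unfamiliar with the scheme described above, here is a minimal
Ruby sketch of a cookie session store that keeps the serialized data next to
a digest of it. It is not the code in lib/raw/context/session/cookie.rb: the
names encode_session, decode_session, secure_compare and SECRET are
hypothetical, and HMAC-SHA256, JSON and URL-safe Base64 are assumptions
chosen for the illustration.

```ruby
require "openssl"
require "base64"
require "json"

# Hypothetical secret; a real application would load this from configuration.
SECRET = "change-me"

# Serialize the session and append an HMAC-SHA256 digest of the payload.
# "." is used as the separator because it appears in neither URL-safe
# Base64 nor hexadecimal output.
def encode_session(session, secret = SECRET)
  payload = Base64.urlsafe_encode64(JSON.generate(session))
  digest  = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, payload)
  "#{payload}.#{digest}"
end

# Return the session hash, or nil if the cookie is malformed or the
# digest does not match the payload.
def decode_session(cookie, secret = SECRET)
  payload, digest = cookie.to_s.split(".", 2)
  return nil if payload.nil? || digest.nil?

  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, payload)
  return nil unless secure_compare(expected, digest)

  JSON.parse(Base64.urlsafe_decode64(payload))
end

# Compare two strings without returning early at the first differing byte.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end
```

The digest is computed over the encoded payload exactly as it is stored, so
the check does not depend on re-serializing the session identically. Whether
encoding or serialization plays any part in the failures reported here is
not known from this thread.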

I don't use Nitro, so I am only replying because your context could involve
simultaneous disk and network activity, in which case your experience might
mirror mine, and it took me months to work out what it was.
I had file copies _randomly_ fail cmp/diff checks.
I reproduce some details below.
If I were you I'd jump straight to the kernel boot parameters, place
the disks and network under _heavy_ load and look for lost ticks in
/var/log/messages.
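
Since this list is Ruby-minded, here is a small sketch of how the log could
be scanned for such messages. The method name lost_tick_lines and the
pattern are assumptions: the exact kernel wording may differ between
versions, so treat the regular expression as a starting point.

```ruby
# Assumed pattern: matches lines mentioning "lost ... tick" or
# "losing ... tick". Adjust it to whatever your kernel actually logs.
LOST_TICK_PATTERN = /\blos(?:t|ing)\b.*\btick/i

# Return every line of the given IO (or any object that responds to
# each_line) that matches the pattern.
def lost_tick_lines(io, pattern = LOST_TICK_PATTERN)
  io.each_line.select { |line| line =~ pattern }
end

# Example use (needs read access to the log file):
#   File.open("/var/log/messages") { |f| puts lost_tick_lines(f) }
```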


Apparent symptom:
----------------------------
   - Files copied to the PVFS2 area might fail a diff or cmp check
(see the related reports below).
   - Typically this occurs when:
       a) large files are copied and
       b) several clients are copying to/reading from the PVFS2 area.
   - No errors were reported in /var/log/messages (but you might see
reports about lost ticks or CPU frequency changes).
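
A check like the cmp/diff one above can be sketched in Ruby as follows. The
method names copy_verified? and failed_rounds are hypothetical, and comparing
SHA-256 digests stands in for cmp; this is an illustration, not the procedure
I actually ran.

```ruby
require "digest"
require "fileutils"

# Copy src to dst, then compare SHA-256 digests of both files.
def copy_verified?(src, dst)
  FileUtils.cp(src, dst)
  Digest::SHA256.file(src).hexdigest == Digest::SHA256.file(dst).hexdigest
end

# Repeat the copy `rounds` times into dir and return the round numbers
# whose copy did not match the source.
def failed_rounds(src, dir, rounds)
  (1..rounds).reject do |i|
    copy_verified?(src, File.join(dir, "copy-#{i}"))
  end
end
```

Run it with a large source file while the disks and network are loaded. Note
that a file read back straight after copying may be served from the page
cache, so an empty result does not prove the data on disk is intact.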

Real symptom:
----------------------
  - The disks are being placed under load when the network connection
is also under some load.

Related reports:
----------------------
 https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=55223
 http://lists.linuxcoding.com/kernel/2006-q1/msg21399.html

How I diagnosed:
------------------------
 - kernel boot parameters:
    report_lost_ticks apic=debug mce=bootlog showopts

Conjectured Workaround
-----------------------------------
This workaround allowed me to download, compile and install a new kernel.
These boot parameters may or may not remedy the inconsistent file copy
results.
 - Add kernel boot parameter (severe, and it gave me boot-up problems):
   noapic
 - Or, less severe, and it worked for me, add:
   no_timer_check

Solution:
------------
 - Upgrade to kernel 2.6.21 or, presumably, anything more recent (I'm using
2.6.21.5). No kernel parameters need to be passed, e.g. no_timer_check can
be dropped.

System:
------------
 - 3 SATA drives arranged as a 3-stripe LVM volume, formatted with XFS
(openSUSE 10.2 defaults)
 - This may be specific to the nVidia ck804 chipset and/or the AMD
64-bit processors (?)

HTH?
Mark

> I would really like to ask everyone on this list with some free time to
> have a look at the code and help me track down
> this nasty bug.
>
> thanks in advance,
> -g.
>
>
> --
> http://me.gr
> http://joy.gr
> http://cull.gr
> http://nitroproject.org
> http://phidz.com
> http://joyerz.com
> _______________________________________________
> Nitro-general mailing list
> Nitro-general at rubyforge.org
> http://rubyforge.org/mailman/listinfo/nitro-general
>

